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Automatic Synthesis of 
Implementations for Abstract Data Types 
from Algebraic Specifications 


Abstract 


- Algebraic specifications have been used extensively to prove properties of abstract data types 


r 


and to establish the correctness of implementations of data types. This thesis explores an 
automatic method of synthesizing’ implementations for data types from their algebraic 
specificat‘ons. 


The inputs to the synthesis procedure consist of a specification for the implemented type, a 
specification for each of the implementing types, and a formal description of the 
representation scheme to be used by the implementation. The output of the procedure 
consists of an implementation for each of the operations of the implemented type in a simple 
applicative language. 


The inputs and the output of the synthesis procedure are precisely characterized. A formal 
basis for the method employed by the procedure is developed. The method is based on the 
principle of reversing the technique of proving the correctness of an implementation of a data 
type. The restrictions on the inputs, and the conditions under wliich the procedure 
synthesizes an implementation successfully are formally characterized. 
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+1. Introduction is 
ol Goals of the Thesis 


This thesis is concerned with the problem of automatic synthesis of implementations 

' for abstract data types from their algebraic specifications. The inputs to the synthesis 

procedure include (i) a formal specification of the data type to be implemented, (ii) a formal 

specification of each of the implementing types. and (iii) a formal description of the 

"representation scheme to be used by the desired implementation. The output consists of an 

implementation for each of the operations of the implemented type. The inputs are specified 
using an algebraic specification technique [14, 18, 25]. 


The thesis has three main goals: 
(1) To precisely characterize both the inputs of the synthesis procedure, and the output. 
(2) To devise an automatic method of deriving the output from the inputs. 
(3) To provide a formal basis ior the method. 


The method of derivation is described in terms of a set of synthesis rules. The 
output is derived by invoking the synthesis rules a finite number of times. The thesis 
describes how the synthesis rules are used in deriving a suitable implementation. 

The purpose of providing a formal basis for the method is to justify the correctness 
of the implementations derived by the synthesis procedure. The formal basis also helps in 


characterizing the scope of the synthesis procedure. 
1.2 Motivation for The Research 


The reliability of computer sofeware has received a great deal of attention in recent 
years. Rapid advances in hardware technology have dramatically decreased the cost of 
hardware relative to software. As a result, the cost of producing and maintaining software has 
become a major concern. An effective way of improving the reliability and the cost of 
software simultaneously is to find methods to decrease the effort required to produce correct 


software. At present, active research is underway [43] in exploring this avenue. Several 


‘approaches have been proposed, each of which can be put under one of the following three 
‘ categories based on the degree of automation it offers: manual approaches, semi-automatic 
: approaches, and automatic approaches. 

The manual approach advocates discipline in human programming (31, 11, 41]. It 

_ consists of identifying new mechanisms of abstractions [32] that encourage the advocated 

i discipline. The most significant contribution of this approach has been the inducement of a 

- change in the attitude of programmers towards the style of programming. Concrete 
manifestations of this change include the birth of the concept of abstract data types, and the 
development of new languages [34, 29, 52] to support data types. 

The goal of the semi-automatic approach is to seek machine help to establish the 
correctness of programs written by the user. Formal methods are developed to specify and 
verify properties of pieces of software [13, 12, 20]; systems are built to carry out verification 

- automatically or semi-automatically [27, 15]. A variant of the verification method is the 
programmer's apprentice method [19]. The programmer's apprentice provides an interactive 
programming environment built up by a set of tools which helps the programmer in 
preparing and checking his work in several ways. The tools range from simple editors to 
more sophisticated ones that can analyze and criticize a user’s program during the various 
phases of programming. -Yet another way of providing partial machine help is to build 
systems [2, 3, 48] that will help apply transformation rules chosen from a catalogue of 
equivalence preserving transformations. The programmer can refine or improve the 
efficiency of his programs by judiciously choosing the appropriate rules from the catalogue. 

The automatic approach, under which our research falls, seeks to automate a part or 
all of the programming process itself. Its goal is to generate code for programs from their 
high-level declarative descriptions, thereby relieving the programmer of having to worry 
about error-prone, low-level details of programming. Though this may one day be feasible, 
experience [1, 36] in the last few years shows that not nearly enough is known about the 
process to automate it completely. Two remedies have been used with some success to break 
the stalemate in the situation: The first is to restrict the domain for which programs are being 
synthesized [4]; the second is to expect the user to furnish more information about the desired 


properties of the program [6] to guide the synthesis procedure. 
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_A third course of action :that has not so far been employed in earnest is to 
-complement the automatic approach with recent advances in programming methodology. 
. (Bauer, et.al., [3] have employed this idea with the semi-automatic approach.) In particular, 
the idea of designing software as a hierarchy of abstractions can be used to aid the synthesis 
. procedure. Such a hierarchical design for the program reduces the amount of refinement 
. required to be performed by the synthesizer at each step. 

The thesis takes into consideration all the factors mentioned above. Within the 

: general area of programming, we restrict. ourselves to the study of synthesis of 
_ implementations for abstract data types. We believe that the synthesis of implementations for 
abstract data types is amenable to automation because the specification techniques for data 
types have been extensively studied, and hence, are better understood. We also expect 
additional information about the implementation to be furnished by the user. This 
information is provided in the form of a description of the representation scheme to be used 


by the implementation. 
1.3 Related Work 


The works rciated to ours lie partly in the area of general program synthesis and 
partly in the area of automatic implementation of data structures. 

In the general area of synthesis, the work most closely related to ours is that of 
Darlington [8, 9]. He has developed a system that uses a set of transformation rules to 
improve semi-automatically the efficiency of recursive programs and also to construct new 
recursive programs. Recently, he has also applied the transformation rules’ to synthesize 
implementations for data types [7]. The synthesis rules developed in the thesis are closely 
related to his, The difference lies in the method in which the synthesis rules are used to 
synthesize implementations. Our method is based on verification techniques of data types. 
Our work has two advantages over his. Firstly, the class of implementations derived by our 
method is larger than his. This is because we develop more ways of using the synthesis rules 
for deriving implementations. Secondly, we formally characterize the conditions under which 


the synthesis rules yield a correct implementation for data types. 


Als 


The ZAP system [30] of Feather’s is a program transformation system in which the 
basic rules of manipulations are similar to our synthesis rules. His work is different from ours 
in two ways. Firstly, he is concerned iwith developing higher level stretegies to apply the basic 
» tranformation rules (in general, any equivalence preserving rules) for the construction of 
large-sized programs. Secondly, his approach is less automatic than ours. The emphasis in 
the design of ZAP is to use "metaprograms" to improve communication between the user and 
the system. There are two inputs to ZAP: the specification of the program to be constructed 
and a metaprogram which consists of a sequence of commands that direct the transformation 
process. The metaprogram expresses the higher level strategy to be used in applying the 
‘tranformation rules, 

Within the area of automatic implementations for data structures, the work of 
Okrent [40] has goals closest to ours, Okrent’s method uses only the algebraic specifications 
of the data types involved as inputs. Because of the lack of information about the desired 
representation scheme, the implementations generated by his synthesis procedure are not as 
interesting as the ones generated by ours. He limits severely the range of the data types 
acceptable as inputs. He also concentrates on a fixed set of target structures such as 
contiguous memory and heap memory for the implementations. 

Another work in this area that is related to ours is that of Subrahmanyam’s [50]. 
Subrahmanyam’s method like Okrent’s does not use any information about the 
representation scheme. His method has a provision for the user to specify performance 
constraints on the desired implementation. The method is based on partitioning the 
operation set of the data type into a kernel set and a nonkernel set. Implementations for the 
kernel operations are derived by identifying pairs of functions (on the representation type) 
called retrievable insertion function pairs. Implementations for the nonkernel operations are 
derived in terms of the implementations for the kernel operations so as to. meet the 
performance constraints. 

Most of the other research in the automatic generation of data structure 
implementations has been concerned with the automatic selection of an optimal 
representation for data structures. Rowe and Tonge [47], Rovner [46], and Tompa and 
Gotlieb [51] have studied optimization problems for a language containing a fixed set of high 


AO 


level data structures. First they build a library of possible implementations for each fixed 
high level data structure in the language, along with a parameterized description of the 
’ performance of each library entry. Then they proceed to select the "best" implementation for 
‘ each instance of the data structure, by making a flow analysis of the program that uses the 
data structure. The goal of our work is to derive an implementation for a given 
representation rather than to select an optimal one among a given set of representations. 
Standish, et.al., [49], Bauer, et.al., [3], and Wile, et.al, [2] have developed catalogues 
of equivalence preserving transformation rules as a part of program development systems. 
The programmer can refine or improve the efficiency of his programs by instructing the 
system to apply appropriate transformation rules on the programs. None of these works, 
however, deals explicitly with the implementation of data types. It is possible, with some 


modifications, to incorporate our synthesis rules as a part of their system. 
1.4 Organization of the Thesis 


The next chapter gives an overview of the synthesis procedure. The third chapter 
describes in detail the inputs of the synthesis procedure, and formalizes the restrictions on the 
inputs. The synthesis procedure derives an implementation in two stages: The 
implementation is first derived in a preliminary form which is then transformed into a final 
form. The first stage of the procedure is the topic of the fourth and the fifth chapters. The 
sixth chapter describes the second stage. The last chapter gives the concluding remarks. 


Be 


‘2. An Overview of the Synthesis Procedure 


This chapter gives an overview of the synthesis procedure. The first section gives a 
scenario of the synthesis procedure from a user’s point of view. It briefly describes the form 
of the inputs to the synthesis procedure, and the form of its outputs via an example, The 
second section gives a summary of the synthesis procedure. It points out the nontrivial issues 
_ involved in the method employed by the procedure for deriving an implementation. The last 


. Section describes the scope of the procedure. . 
2.1 The User’s View 


Consider the following scenario involving a programmer. The programmer has 
designed an abstract data type (the implemented type) to be used in solving one of his 
programming problems. He is now seeking the help of a system for implementing the type 
using another data type, called the representation type; The representation type is chosen by 
the user himself. Furthermore, he is willing to furnish information about how he wants the 
values of the representation type to’ be used in representing the values of the implemented 
type. The system is expected to generate automatically (or with some help from the user) an 
implementation for the implemented type that uses the representation type as the 
representation in a manner consistent with that suggested by the user. 


Viewed as a black box, the inputs to the procedure are: 
(i) A specification of the implemented type, 


(ii) a specification of the representation type, and specifications of all the types used in 
the specification of the representation type. We refer to the representation type, and 
all the types its specification uses as the implementing types. 


(iii) an association specification that describes how the values of the representation type 
are to be used in representing the values of the implemented type; this corresponds 
to the representation (or abstraction) function defined by Hoare in [21]. 


The output of the synthesis procedure consists of an implementation for each of the 
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_ operations of the implemented type in terms of the operations of the implementing types. To 
_ get a better idea about the inputs and the output, let us consider an example of deriving an 
- implementation for the data type Queue_Int in terms of Circ_List- Queue_Int is a 
first-in-first-out queue of integers. Elements are added to a queue at the rear end, and 
removed from the front end. Circ_List is a list of integers. Elements are inserted into and 
removed from a list at the same end, which is the rear end of the list. The operation that gives. 
Circ_List a circular character is Rotate. Rotate moves every element in a list by one position 
towards the rear end in a cyclic fashion, i.e., the element at the rear end is moved to the front 
end. 

In this example, the implemented type is Queue_Int and the representation type is 
Cire_List. Circ_List uses (this notion is defined precisely in the next chapter) the data types 
Integer and Bool, so the implementing types include Circ_List, Integer, and Bool. Figures 1, 
2, and 3 give the inputs to the synthesis procedure. (The figures also give an informal 
description of the operations of the data types.) Specifications of Integer and Bool should 
also be given as inputs, although we have not shown them here. The language used to express 
the data type specifications is equational, similar to the ones developed in [14, 18, 25]. One of 
the crucial differences is the following: We assume that the specification of every data type 
identifies a basis for the data type. A basis is a minimal set of operations of the data type that 
can be used to generate all the values of the type. The operations in the basis are called the 
generators of the type. For example, the operations Create and Insert can be the generators 
for Cire_List. The specification language is described in the next chapter. 

Fig. 3 gives the association specification for the implementation to be derived. It 
characterizes the representation scheme to be used by the implementation. The association 
specification is expressed in two parts. The first part specifies the invariant 3. 5 is a predicate 
that specifies the set of values that may be used to represent the values of the implemented 
type; only those values of the representation type for which J is True may be used to 
represent the values of the implemented type. In the present example, J is True for all values 
of Cire_List. The second part specifies the abstraction function A; A maps a value the 
representation type to the value of the implemented type that the former may represent. In 


the present example A specifies the following mapping: The empty queue is represented by 
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- Fig. 1. Specification of Queue_Int 
Queue_Int is Nullq, Enqueue, Front, Dequcue, Append, Size 


| Defining Types 


Bool, Int 


Operations 


. Nullq :-> Queue_Int 

Enqueue : Queue_Int X Int -> Queue_Int 

Front * Queuc_Int -> Int U { ERROR } 
Dequeue : Queue_Int -> Queue_Int U { ERROR } 
Append =; Queue_Int X Queue_Int -> Queue_Int 
Size : Queue_Int -> Int 


Comment. 

Queuc_Int is a FIFO queue of integers. Nullq constructs the empty queue. Enqucue adds an element to 
a qucue at the rear end. Dequeue removes the element at the front of a queue. Front returns the 
element at the front of a queue. Append joins two queues adding the elements of the second argument 
at the rear of the first argument. Size computes the number of elements in a queue. 


Basis 


{ Nullq, Enqueue } 


Axioms 


(1) Front(Nullq) = ERROR 
(2) Front(Enqueue{Nullq, e)) = e 
(3) Front(Enqueue(Enqucue(q, e1), e2)) = Front(Enqueue(q, e1)) 


(4) Dequeue(Nullq) = ERROR 
(5) Dequene(Enqueue(Nullq, e)) = Nullq 
(6) Dequeue(Enqueue(Enqueue(q, e1), e2)) = Enqueue(Dequeue(Enqueue(q, e1)), e2) 


(10) Append(q, Nullq) = q ; 
(11) Append(q1, Enqueuc(q2, e2)) = Enqueue(Append(ql, q2), e2) 


(12) Size(Nullq) = 0 
(13) Size(Enqueue(q, e)) = Size(q) + 1 
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: Fig. 2. Specification of Cire_List 
’ Circ_List is Create, Insert, Value, Remove, Rotate, Empty, Join 


Defining Types 
Integer, Boolean 


Operations 


Create —: -> Circ_List 

Insert =: Circ_List X Integer -> Cire_List 
Value —: Circ_List -> Integer U { ERROR } 
Remove =: Circ_List -> Cire_List U { ERROR } 
’ Rotate =: Circ_List -> Circ_List 

Empty =: Circ_List -> Boolean 

Join : Circ_list X Cire_list -> Cire_list 


Comment 

Circ_List is a list of integers with a front end and a rear end. Create constructs an empty list; the front 
and the rear ends of an empty list are the same. Insert inserts an clement into a list at the rear end. 
Value returns the element at the rear end of a list. Remove removes the element at the rear end from a 
list. Rotate moves every element in a list by one position towards the rear end in a cyclic fashion, i.e., 
the element at the rear is moved to the front. Empty checks if a list is empty. Join joins two lists by 
positioning the first argument in front of the second. 


_ Basis te 
{Create, Insert} 


Axioms 


(1) Value(Create) = ERROR 
(2) Value(Insert(c, i)) = i 


(3) Remove(Create) = ERROR 
(4) Remove(Insert(c, i)) = c 


(5) Rotate(Create) = Create 
(6) Rotate(Insert(Create, i)) = Insert(Create, i) 
(7) Rotate(Insert(Insert(c, i1), i2))) = Insert(Rotate(Insert(c, i2)), i1) 


(8) Empty(Create) = true 
(9) Empty(Insert(c, i)) = false 


(10) Join(c, Create) = c 
(11) Join(c, Insert(d, i)) = Insert(Join(c, d), i) 
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Fig. 3. Association Specification 
’ Invariant 


' $(c) = True 


Abstraction Function 


A(Create) = Nullg 
A(Insert(c, i)) = add_at_head(A(c), i) | 


add_at_head(Nullq, i) = Enqueue(Nullg, i) 
add_at_head(Enqueuc(q, i), il) = Enqueuc{add_at_head(q, il), i) 


the empty list. A nonempty queue is represented bya list whose elements are identical to the 
ones in the queue, but are arranged in the reverse order. The motivation for this 
representation scheme is that” reading and‘ deletion of elements froma queue can be. 
performed efficiently. Note that the specification of A uses an auxiliary function 
Add_at_head on Queue_Int; the auxiliary function adds an element at the front end of a 
queue. | 

Fig. 4 shows the output of the synthesis procedure. The output defines a set of 
functions, called the implementing functions, on Circ_List. Every implementing function 
implements an operation of Queue_Int. The implementing function implementing the 


operation f is given the name F. For instance, NULLQ implements Nullg. The target 


Fig. 4. An Implementation 
NULLQ()::= Create() 


ENQUEUE(c, j)::= Rotate(Insert(c, j)): 
FRONT(c) ::= Value(c) 

DEQUEUE(c) :: = Remove(c) 

APPEND(c, d) :: = Join(d, c) 


SIZE(c).:: = if Empty(c) then 0 
else SIZE(Remove(c)) +1 
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« language used to express the implementations for the operations is a simple applicative 


language. The only mechanisms available in the language to build programs are: functional 


: composition, conditional expressions, and recursive function definition. The language uses a 


‘ method of defining function that is customarily used in applicative languages like pure LISP 


: ¥ 


= 


[37]. A function F is defined using the following schema: F(v,. +--+ vy) s1= e, where 


peers, are variables, and e is an expression containing those variables. A function 


definition may use the operations of the implementing types as base functions, 


2.2 A Summary of the Synthesis Procedure 


The synthesis procedure is summarized in an illustrative fashion using the example 
already introduced. This is done in the first two subsections, In the example introduced, the 
invariant J is a trivial one: It is True on all values. In the third subsection, we highlight the 
issues involved in deriving an implementation in the presence of a nontrivial invariant by 
introducing a new example. | 

The method used by the procedure to derive an implementation is based on treating 
every equation in the specifications as a rewrite rule.2 The procedure begins by combining all 
the input specification: into a rewriting system called the Initial World (IW). (IW is obtained 
by simply replacing the symbol = by — in the input specifications.) The procedure assumes 


. that IW satisfies the uniform termination property as well as the unique termination property. 


(IW is said to be convergent in such a case. This is similar to the Church-Rosser property.) 
The uniform termination property ensures that every chain of reductions starting from an 
expression terminates. The unique termination property ensures that all chains of reductions 
starting from an expression terminate in the same expression.. These two properties ensure 
that the equivalence relation characterized by a specification can be computed by using the 


rules in IW for reducing expressions. The procedure also assumes that there is a predefined 


2. A rewrite rule (written a — 8) is an ordered pair-a left hand side and a right hand side - of 
expressions. A rewrite rule can be used to reduce any expression that is an instance of the left hand 
side into an expression that is an instance of the right hand side. A rewriting system is a set of rewrite 
Tules. 


-19- 


termination ordering (>) on expressions which can be used for showing the uniform 
: termination property of rewriting systems. . | 

The synthesis procedure derives the implementation in two stages, In the first stage 
: the procedure derives the implementation in an intermediate form. The intermediate form is 
called a preliminary implementation. In the second stage the preliminary implementation is 
. transformed into an implementation in the target language (target implementation). Fig. 5 
. gives a preliminary implementation for Queue_Int that is consistent with the association 
specification given in Fig.3. There are two crucial differences between a preliminary 
implementation and a target implementation. The first one concerns the methods used for 
defining the implementing functions. A preliminary implementation defines a function as a 
set of rewrite rules. The rewrite rules defining an implementing function F are the ones that 
have F as the outermost symbol on their left hand side. For instance, rules (2) and (3) in 
Fig.5 define ENQUEUE. The second difference is that the only operations of the 
representation type that are permitted to appear in a preliminary implementation are its 
generators, A target implementation is permitted to use all the operations of the 
representation type. In the example under consideration, for instance, a preliminary 


implementation may use all the operations of Integer and Bool, but only the generators 


Fig. 5. A Preliminary Implementation 


(1) NULLQ(Q) — Create() 


(2) ENQUEUE(C reate, j) —+ Insert(Create, j) 
(3) ENQUEUE(Insert(c, i), j) + Insert(ENQUEUE(c, j), i) 


(4) FRONT(Create) — ERROR 
(5) FRONT(insert(c, i)) I 


(6) DEQUEUE(Create) + ERROR 
(7) DEQUEUE(Insert(c,i)) — ¢ 


(8) APPEND(c, Create) — c 
(9) APPEND(c, Insert(d, 1)) - APPEND(ENQUEUE(c, i), d) 


(10) SIZE(Create) + 0 
(11) SIZE(insert(c, i)} + SIZE(c) +1 


-20- 


4 (Create, and Insert) of Circ_List. 
There are two reasons for the decomposition. Firstly, it makes the synthesis 
: procedure more modular. Target language dependent transformations are separated from the 
: language independent transformations. The decomposition also lends itself naturally to 
deferring efficiency improving transformations to the later stage. In the first stage one can 
concentrate on deriving a simple correct implementation. Secondly, the decomposition 
reduces the complexity of the structure of synthesis procedure. The first stage deals with the 
techniques for deriving an implementation from the specification of the data type. The 
second stage deals with the techniques for deriving alternate forms of implementations from 
an preliminary implementation. The decomposition provides a better insight into the 
synthesis method, and simplifies the description of the synthesis procedure. The next two 


subsections give an overview of the two stages of the synthesis procedure. 
2.2.1 Stage 1: Preliminary Implementation Derivation 


A preliminary implementation of a data type is correct with respect to an abstract 
function .A if the following condition holds: Every implementing function F (that implements 
the operation f) defined by the preliminary implementation is a total function on the 
representation values . that the homomorphism property 36(F(x)) = f(36(x)) holds. Here 36 
is a function on the values of the implementing types; 36 behaves exactly like the abstraction 
function .4 on the representation values, and like an identity function on all other values. The 
synthesis procedure derives a preliminary implementation so that the above criterion of 
correctness is satisfied. 

The procedure synthesizes the preliminary implementation for one operation at a 
time by deriving a separate set of rewrite rules for every operation. Since the method used is 
the same for every operation, we illustrate the synthesis of only a couple of operations, The 
procedure first determines the left hand sides of all the rules of the preliminary 


implementation. Then, it determines a suitable right hand side for each of the rules. 
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: 2.2.1.1 Determining the Left Hand Side 


One of the correctness requirements of a preliminary implementation is that it must 


- define a total function on the representation type. This requirement is ensured by deriving 


the rules of the preliminary implementation so that (1) they satisfy the uniform termination 


' property, and (2) they are well-spanned. The first property is ensured while deriving the right 


. hand side of the rules. The second property is used to determine the left hand sides. 


The second property requires the léft hand side expressions of the rules to be of a 
particular form. For instance, any pair of rules that have the form given below constitute a 
well-spanned set of rules for ENQUEUE. (In the following rhs, and ?rhs, are used as place 
holders for expressions to be determined later.) 

ENQUEUE(Create, j) — ?rhs, 

ENQUEUK(Insert(c, i), j) — rhs, 

Note that the left hand side of each of the above rules consists of ENQUEUE 
applied to arguments that are generator expressions.> The set of arguments, i.e., sequences of 
generator expressions, to ENQUEUE on the left hand side of the rules is 
ArgsSet = {<Create, >, <Insert(c, i),j>}. ArgsSet spans the set of all ordered pairs of 
generator constants. In other words, every pair of generator constants is an instance of one of 
the arguments in ArgsSet. This property ensures that the definition of ENQUEUE accounts 


for all the representation values. It is easy to build a procedure that automatically generates a 


- well-spanned ArgsSet, once the generators of the representation type are identified. Thus, an 


appropriate set of left hand sides for the rewrite rules to be derived can be determined 


‘ automatically. 


3. A generator expression is an exnression in which the only function symbols involved are the 
generators. A generator constant is a generator expression that does not contain any variables. 
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2.2.1.2 Determining the Right Hand Side 


The right hand sides of the rules are determined so that the preliminary 
: implementation satisfies the homomorphism property mentioned earlier, For this, the Initial 
World, IW, is first supplemented with a set of rules, called the 36-rules. The ‘6-rules express 


the homomorphism property; there is an 36_rule for every implementing function. For 


instance, the Jé-rule —— corresponding to ENQUEUE is 
36(ENQUEUK(c, j)) + Enqueue(36(c), 36(j)). Let us call the supplemented system the 
Perturbed World (pw).4 


The Perturbed World (PW) is then used to derive a set of synthesis equations, one 
equation for every rule in the preliminary implementation. The right hand side of a rule is 
determined from the right hand side of the corresponding synthesis equation. For instance, 
the synthesis equation corresponding to the rule ENQUEUE{Insert(c, i), j) > ?rhs, is an 
equation of the form 4(ENQUEUK(Insert(c, i), j)) = 36(2rhs,) that satisfies the following 


conditions: 
(1) 3%6(ENQUEUE(nsert(c, i), j)) = i6(?rhs,) is a theorem of PW 
(2) ENQUEUE{Insert(c, i), j) > ?rhs, - 


(3) rhs, contains only the permitted operations of the implementing types, and the 


implementing functions. 


The Synthesis Theorem in chapter 4 shows that, when a preliminary 
implementation is well-spanned, the preliminary implementation satisfies the 
homomorphism property if the synthesis equation corresponding to each of the rules in the 
preliminary implementation is a theorem of PW. Note that the second condition above 
ensures that the rewrite rules derived satisfy the uniform termination property. The third 


condition ensures the syntactic correctness of the preliminary implementation. 


4. Note that since 4 is a function that behaves essentially like A, the rewrite rules specifying it in PW 
are obtained by simply replacing A by 36 in the asociation specifcation. 
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2.2.1.3 Deriving the Synthesis Equations 


| Every synthesis equation of the preliminary implementation is derived with the help 
, of two inference rules called the synthesis rules. The synthesis rules are designed for 
generating theorems of PW that have the same left hand sides, but different right hand sides. 
For deriving a synthesis equation, the synthesis rules are invoked repeatedly a finite number 
of times to generate a series of theorems until the desired equation is generated. For instance, 
the synthesis equation corresponding to the rule ENQUEUE(Insert(c, i), j) + ?rhs, is derived 
by generating a series of theorems that have 4(ENQUEUE(Insert(c, i), j)) as their left hand 
side. The generation continues until a theorem whose right hand side qualifies the theorem 
to be a synthesis equation is encountered. 

The idea used for generating an equation is to reverse the method of demonstrating 
that such an equation is a theorem of PW. The central notion used in the generation is a 


5 is the opposite of reduction. It‘is the act of 


mechanism called expansion. Expansion 
applying a rewrite rule to an expression from right to left. 

For example, consider the rule 36(ENQUEUE(e, j)) + Enqueue(J6(c), 36(j)), and 
the expression Add_at_head(Enqueue(36(Create), 36(i)), k). The subexpression 
Enqueue(36(Create), 36(i)) is an instance of the right hand side of the rule for the substitution 
-{cr+Create,jrri}. The corresponding instance of the left hand side is 
‘36(ENQUEUE(Create, i)). Therefore, Add_at_head(Enqueue(36(Create), 36(i)), k) expands to 
Add_at_head(36(ENQUEUE(Create, i)), k) by the rule. 

The first synthesis rule specifies a way of generating a theorem from an expression 
with that expression as the left hand side. In the following e) denotes the normal form of e 
obtained using Pw. (The normal form of e is the result of reducing it using the rewrite rules 


of PW until it becomes irreducible.) - 


5. The definition of expansion will be revised later in chapter 4 to make it more general. According to 
the definition given here, expansion is identical to the transformation technique folding used by 
Darlington [7] for synthesis of recursive programs. 

6. PW is a convergent system. Therefore, every expression is guaranteed to have a unique normal 
form. 
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: e is an expression 
le 1: € 1S an expression 
Rule e=etl 


The second synthesis rule specifies how to generate a theorem from an existing one 
so that the new theorem has the same left hand side as the old one. In the following 
. expand(e,) denotes any expression that is an expansion of e, using some rewrite rule of PW. 


€, =e, 


Rule 2: e, = expand(e,) 


We investigate two methods in which the synthesis rules can be used for deriving a 
synthesis equation. The first method derives synthesis equations that are in the equational 
theory of PW. The second method derives equations that are in the inductive theory. The 
second method is more general than the first one. A system that implements the synthesis 
procedure would, therefore, use only the second method. We discuss them separately for 


pedagogic reasons. 
2.2.1.3.1 Derivation in the Equational Theory 


As an illustration, let us derive a synthesis equation that is of the form 
46(ENQUEUE(insert(c, i), j)) = 36(2rhs,). | The equation is derived by generating a series of 
theorems that have 6(ENQUEUE(Insert(c, i), j)) as their left hand side. The generation is 
begun by invoking synthesis rule(1) on the left ‘hand side expression. The rest of the 
theorems in the series are generated by invoking synthesis rule (2) using the rewrite rules of 
PW for expansion. The rewrite rules for expansion are chosen with the following ultimate 
goal: Obtain a right hand side that has the form 3(?hs,) so that 
~ 3(ENQUEUL(Insert(c, i), j)) > 36(?rhs,), and rhs, contains only the implementing 
functions and the permitted operations of the implementing types. In the illustration given 
below, the generation of every theorem in the series is considered as a step. At each step, the 
expression expanded, and the rewrite rule used for expansion are indicated. The relevant 
rewrite rules of PW that are going to be used for expansion are listed at the beginning. 
Rule (1) is the 3-rule coresponding to Enqueue; rules (2) through (5) are obtained from the 


association specification. 
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Relevant Rewrite Rules of the Perturbed World 
(1) KCENQUEUE(e, j)) + Enqueue(36(c), 36(j)) 


(2) 6(Create) + Nullq 
' (3) 36(Insert(c, )) > Add_at_head(36(c), 36(i)) 


(4) Add_at_head(Nullq, i) + Enqueue(Nullq, i) 
(5) Add_at_head(Enqueue(q, i), j) + Enqueue(Add_at_head(q, j), i) 


Form of the theorem to be generated: J6(ENQUEUE(nsert(c, i), j) = 36(?rhs,) 
Normal form of 4(ENQUEUE(nsert(c, i), j)): Enqueue(Add_at_head(36(c), J6(i)), 36@)) 
Rules used for the normal form: (1), (3) - 


Step (1) Invoke Synthesis Rule (1) on J6(ENQUEUE(Insert(c, i), j)) 
36(ENQUEUE(Insert(c, i), j)) = Enqueue(Add_at_head(36(c), 36(i)), 36@)) 


Step (2) Expand Expression: Enqueuc(Add_at_head(36(c), 36(i)), 36(j)) 
Using Rule: (5) 


3(ENQUEUE(nsert(c, i), j)) = Add_at_head(Enqueue(36(c), 36(j)), 36()) 


Step (3) Expand Expression: Enqueue(}6(c), 16(i)) 
Using Rule: (1) 


4%6(ENQUEUE(Insert(c, i), j)) = Add_at_head(36(ENQUEUE(, j)), 36(i) 


Step (4) Expand Expression: Add_at_head(36(ENQUEUK(c, j)), 36(i)) 
Using Rule: (3) 


COR ORE AEs FRE RRS ESOS SESE ESSERE EHS OEE EESS EERE SSE ESRAReR TREE EEE 


4(ENQUEUE(Insert(c, i), j)) = H(Insert(ENQUEUE(, }), i)) 


The theorem generated in step (4) qualifies to be a synthesis equation. Hence the desired rule of the 
. preliminary implementation is: 


ENQUEUL(nsert(c, i), j) + Insert(ENQUEUE(c, j), i) 


One can similarly generate a theorem of the form 1(ENQUEUE(Create, j)} = 1(nsert(Create, j)), 
which gives rise to the following rewrite rule to complete the preliminary implementation for 
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~ ENQUEUE: 
ENQUEUE(Create, j) + Insert(Create, j) 


: 2.2.1.3.2 Derivation in the Inductive Theory 


The method used for deriving a synthesis equation in the inductive theory is based 
on the following property that every theorem of PW satisfies: If an equation is a theorem of 
PW, then every instance of it is in the equational theory of PW. An instance of an equation 
e 


e, is an equation obtained by replacing every variable in e, and e, by generator 


gaa 
constants, ’ 

We, therefore, take the following approach for deriving an equation in the inductive 
theory. First derive an instance of the desired equation; the method of derivation described 
earlier can be used for this purpose. The instance of the equation derived should be such that 
a generalization of it has the form of the desired synthesis equation, and is a theorem of PW. 
A generalization of e, = e, 
e, by suitable variables. To check if the generalization is a theorem of PW, we use an 


is an equation obtained by replacing assorted constants in e, and 


automatic procedure called is-an-inductive-theorem-of. The procedure is an extension of the 
method of using the Knuth-Bendix completion algorithm for proving inductive properties of 
convergent rewriting systems [28, 38, 22]. The procedure is described in chapter 4. 

As an illustration let us derive a synthesis equation of the form 
36(APPEND(c, Insert(d,i ))) = 36(@rhs,) which gives rise to one of rules in the preliminary 
implementation of Append. We begin by deriving an instance determined by the replacement 


of the variable d by the constant Create, and then apply generalization. 


Relevant Rewrite Rules of the Perturbed World 


(10) Append(q, Nullq) — q 
(14) 36(Create) — Nullq 


7. A generator constant is an expression formed out of generators, and does not contain any variables. 


OP 


. ~ (20) H(ENQUEUE(C, i)) — Enqueue(36(c), 36(i)) 
. (22) 6(APPEND(c, d)) — Append(3€(c), 36(d)) 


: Form of the theorem to be generated: #6(APPEND(c, Insert(Create, i))) = 16(?e) 


- Normal 


form of 36(APPEND(c, Insert(Create, i))): Enqueue(36(c), J6(i)) 


: Rules used for the normal form: 


Step (1) 


Step (2) 


Step (3) 


Invoke Synthesis Rule (1) on 36(APPEND(c, Insert(Create, i))) 
36(APPEND(c, Insert(Create, i))) = Enqueuc(36(c), J6(i)) 


Expand Expression: }6(APPEND(c, Insert(Create, i))) 
Using Rule: (10) 
aanweners Leb pcenapacicooeiSesedbewsceanumaadewowsnends 
46(APPEND(c, Insert(Create, i))) = Append(Enqueue(36(c), 36(i)), Null) 


Expand Expression: Nullq 
Using Rule: (14) 


2M oo SCOT ROR MT RP SSE POPE SeeNOEE REE SERTS 


46(APPEND(c, Insert(Create, i))) = Append(Enqueue(36(c), 36(i)), 16(Create)) 


Expand Expression: Enqueue(36(0), 16()) 
Using Rule: (20) 


ee ROR EE Re Ses ore eRe E ES Ose ES EOE SSE HEE nae SESEESOS PeSrns EER SSeEEEES 


46(APPEND(c, Insert(Create, i))) = Append(36(ENQUEUE(c, i)), 16(Create)) 


Expand Expression: Append(36(ENQUEUE(ce, i)), 36(Create)) 
Using Rule: 


Se BE NOS SOO SONNE Eee Ee RE REEEe BE BeRsEEESeeRsEs SHEEN EE ZENS SE REA RES EERAS 


Step (6) 


4(APPEND(c, Insert(Create, i))) = J6(APPEND(ENQUEUE(, i), Create)) 


Generalize the theorem in step (5) by replacing the constant 
Create by the variable d to obtain the following equation: 
36(APPEND(c, Insert(d,i ))) = 4(APPEND(ENQUEUK(c, i), d)) 


Apply is-an-inductive theorem-of on the above equation. 
This yields True confirming that the equation is a theorem. 
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Hence the desired rule (obtained by dropping 36 on both sides) is: 
APPEND(e, Insert(d,i)) - APPEND(ENQUEUE(e, i), d) 


One can similarly generate a theorem of the form 36(APPEND(Create, d)) = 36(d) which gives rise to 
. the following rewrite rule to complete the preliminary implementation of APPEND. 


APPEND(Create, d) — d 


2.2.2 Stage2: Derivation of the Target Implementation 


In the second stage of the synthesis procedure, the preliminary implementation is 
transformed into a target implementation. It should be noted that the preliminary 
implementation is itself an executable implementation. It can be executed by an interpreter 
that is capable of simplifying algebraic expressions using the equations in the specifications of 
data types as rewrite rules. The data type verification system AFFIRM [39] provides such an 
interpreter. Given the specifications of all the implementing types, the interpreter can 


execute the preliminary implementation on any given input. Our goal is to derive the target 


implementation in a form that can be compiled by a compiler for an applicative language. _ 


There are two reasons why a target implementation is more efficient than a preliminary 


implementation. The first one arises because of the freedom to use nongenerators of the 
Tepresentation type in a target implementation. This makes it possible, in some instances, to 
eliminate recursion from a preliminary implementation of an operation, and to transform into 


one which is a composition of the operations of the implementing types. The second reason 


is that an implementation that can be compiled by means of a conventional compiler is in _ 


general more efficient than interpreting a set of rewrite rules. We investigate two methods of 
transforming a preliminary implementation into a target implementation. We describe each 
of them briefly below. The first method, although less efficient than the second, derives a 


larger set of implementations. 
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- 2.2.2.1 Recursion Eliminating Method 


~ : 


According to this method the problem of deriving a target implementation is viewed 


as finding a composition f* of ‘the operations of the implementing types and the 


- implementing functions (possibly including the if_then_else function) that has the same 


functional behavior as the implementing function F defined by the preliminary 
implementation. For example, the composition Rotate(Insert(d, k)) has the same behavior as 
the function ENQUEUE defined by the rewrite rules of the following preliminary 


implementation: 


ENQUEUE(Create, j) — Insert(Create, j) 
ENQUEUE(Insert(c, i), j) + Insert(ENQUEUE(, j), i) 


So, the — following can —_— ibe a target implementation for it: 
ENQUEUE(d, k) ::= Rotate(Insert(d, k)). Note that the target implementation does not use 
recursion. . 

More formally, the problem can be stated as follows: Find a co:nposition f* so that 
the equations obtained by substituting f* for ENQUEUE in the rewrite rules are theorems of 
the implementing types. The equations for ENQUEUE are given below. Note that, in 
obtaining the following equations, the two sides of the rewrite rules are interchanged after 
replacing ENQUEUE by f*. The need for the interchange will be explained later. 

(1) Insert(Create, j) = f*(Create, j) 
(2) Insert(f*(c, j), i) = f*(Insert(c, i), j) 

We use the following strategy to find a solution for f*. We generate a theorem of 
the implementing types using one of the above equations as a template. For generating such 
a theorem we use the synthesis rules mentioned earlier. However this time, since we are 
interested in the theorems of the implementing types, the rewrite rules in the specification of 
the implementing types are used for expansion. The theorem generated determines a 
candidate for f*. The goal is to generate a theorem so that the candidate for f* determined by 
the theorem also satisfies the other equation. -For instance, the sequence of steps given below 


generates a theorem that has the form of equation (1). 
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+ Rewrite Rules of Circ_List : . i 


. (3) Rotate(Create) — Create 
(4) Rotate(Insert(Create, i)) — Insert(Create, i) 
(5) Rotate(Insert(Insert(c, i1), i2)) + Insert(Rotate(Insert(c, i2)), il) 


_ Form of the theorem to be generated: Insert(Create, j) = f*(Create, j) 
- Normal form of Insert(Create, j): Insert(Create, j) 
Rules used for the normal form: None 


Step (1) Invoke Synthesis Rule (1) on Insert(Create, j) 
Insert(Create, j) = Insert(Create, j) 


Step (2) Expand Expression: Insert(Create, j) 
Using Rule: (4) 


Sees We eee seer Nees ees eeeEeesA sen eS ES a Senses Behe SeeeeweEnesennRASaS 


Insert(Create, j) = Rotate(Insert(Create, j) 


The last theorem generated in the above series suggests that Rotate(Insert(d, k)) is a 
candidate for f*(d,k). The candidate composition can be determined mechanically by 
comparing the theorem generated with the template equation. The candidate we currently 
have is such that the equation Rotate({Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i), 
which is obtained by replacing f* by Rotate ° Insert in equation (2), is a theorem of Circ_List. 
Had the candidate obtained in the last step not satisfied equation (2), the theorem generation 
would have continued further to generate another theorem that had the form of equation (1). 
The reason that the first equation, rather than the second, was used as the template 
equation is the following. The synthesis rules are formulated so that the unknown expression 
in the equation to be searched for is on the right hand side. In equation (2) both sides are 
unknown since f* occurs on both the sides. That is not the case with equation (1). This was 
also the reason for interchanging the two sides of the rewrite rules while obtaining the 
template equations. In the example illustrated the theorem desired was in the equational 
theory. In general, we need to use the generalization technique described earlier since the 


-31- 


+ theorem may be in the inductive thedry. 
2.2.2.2. The Recursion Preserving. Method 


In this method the target implementation is derived with the help of a special set of 
functions, called the inverting functions,’ on the representation type. To understand what 
" inverting functions are, and why there are useful, let us consider an example. The 


preliminary implementation of SIZE consists of the following rules: 


SIZE(Create) — 0 
SIZE(Ipsert(c, i)) + SIZE(c) + 1 


A target implementation for SIZE may take the following form: 


SIZE(d) ::= if Empty(d) then 0 
else SIZE(Remove(d)) + 1 


Note that in the preliminary implementation the argument to SIZE on the left hand 
side of a rule is permitted to be a generator expression, The argument indicates the pattern or 
the structure of the expression that constructs the values for which the rewrite rule is 
applicable,? This freedom.is used in a preliminary implementation to perform a case analysis 
based on the structure of the argument, and to decompose the argument. 

In a target implementation the. argument to SIZE on the left hand side of the 
definition must be a variable. This means that the expression on the right hand side of the 
definition must have explicit subexpressions for determining the structure of the argument, 
and to decompose the argument. Inverting functions of a data type can be used to build these 
subexpressions. 

Informally speaking, the inverting functions of a data type are functions that can be 


8. Inverting functions are closely related to distinguished functions of a data type defined in [24]. In 
[24] the distinguished functions are used to formalize the expressive power of a data type. 

9. If we are interested in ‘interpreting the preliminary implementation, it is, therefore, necessary for 
the interpreter to have pattern matching capability to invoke the appropriate rewrite rule while 
simplifying an expression. 
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; used to algorithmically invert the process of constructing a value of the type from the 
» generators of the type. In other words, by applying one or more of the inverting functions a 
~ finite number of times on a value one can determine a generator expression that constructs 
: the value. For instance, for Cire_List the operations Rotate, Value, and Empty can serve as a 
set of inverting functions. The structure of any circular list value in terms of Create and 
Insert can be determined using these operations. For instance, if vy is a variable denoting the 
value constructed by Insert(c, j), then Remove(v) extracts the component c; ~Empty(v) checks 
if vy is constructed by an expression of the form Insert(c,j). So, the rewrite rules can be 
merged into the following conditional. expressions: 
if Empty(d) then 0 else SIZE(Remove(d)) + 1. 

The target implementation is derived in two steps. The first step identifies a set of 
inverting functions for the representation type. In the second step the rewrite rules 
constituting the preliminary implementation of every operation are transformed into a target 
implementation in terms of the inverting functions. The method is described in detail in 


chapter 6. 
2.2.3 Extending the Synthesis Procedure 


Consider the association specification given in Fig. 6. It specifies a representation 
scheme for implementing Queue_Int as a triple Array_Int X Integer X Integer, which can 
informally be described as follows. (Array_Int is specified in the next chapter which also 


describes the association specification shown below in more detail.) Nullq can be represented 


Fig. 6. Queue_Int in terms of Triple 


A<v, i, D>) = Null 
A(<Assign(v, e, j), i j4+- 15) = ifi = j+1 then Nullq 
else Enqueue(A(<v, i, >), e) 


5(<v, i, D>) = True 
I(<Assign(y, e, j), i, j+ 1>) = if i = j+1 then True 
else if i < j+1 then i<v, i, D) 
else False 
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by any triple in which the two integer components are equal. A nonempty queue can be 
represented by a triple <v, i, j>, where v is an array of arbitrary length containing the elements 
" of the queue between the index values i and j-1, in order. In other words, i points to the front 
end of the queue, and j points to the next position available in the queue for adding an 
: element. Note that in this example, unlike the last one, not every value of the representation 
_ type can legally represent a queue. A triple <yv, i, > is a legal representation value if only if 
| i <j, and v is guaranteed to be defined on all index values between i and j-1. The invariant 3 
in Fig. 6 specifies this condition. . 

The synthesis the presence of a nontrivial invariant J has to be performed differently 
because the implementation must be such that every implementing function F defined 
preserves J: That is, (V y)[3(v) => 3(F(v))]. 

The synthesis procedure for such a situation is similar to the one described earlier 
except for the method employed in determining the right hand sides of the rules of a 
preliminary implementation. The difference lies in the set of rewrite rules used for expansion 
while generating the theorems. Earlier, the rewrite rules of PW were used, but now it is 
necessary to use an additional set of rewrite rules. The additional rewrite rules describe 
' information pertaining to the invariant J, and the assumption that the arguments to the 
implementing function satisfy the invariant. The information pertaining to J is maintained as 

a separate entity called the Temporary World. Chapter 5 describes how the Temporary World 


is constructed, maintained, and used in the synthesis of an implemenation. 
2.3 The Scope of the Synthesis Procedure 


The scope of the synthesis procedure is limited because of two reasons. Firstly, the 
restrictions imposed on the input specifications limit the range of data type specifications that 
are acceptable as inputs to the procedure. Secondly, the synthesis procedure is capable of 
. deriving only a class of implementations that satisfy certain properties. We describe the two 
forms of limitations below. 
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#2.3.1 Restrictions on the Inputs 


The input specifications must be such that the Initial World (IW), which is a 


combination of all the specifications, ‘forms a rewriting system that 
(1) has the uniform termination property, 
(2) has the unique termination property, and 
(3) is well-spanned. | 


The second and the third properties are not restrictive because they can be attained 
by adding certairi additional rewrite rules to the system. There are automatic procedures [28, 
38, 22] for determining the rules that need to be added, provided the system satisfies the 
uniform termination property. 

The uniform termination property can be restrictive. It is, in general, not possible to 
express all the properties one wishes to specify in a manner that preserves the uniform 
termination property. For example, consider the data type Set_of_Elements that has an 
operation Insert to insert an element into a set. To express the property that the order of 
insertion of elements into a set is immaterial, it is necessary to have a rewrite rule of the form 
Insert(Insert(s, i), j) + Insert(Insert(s, j), i) as a part of IW. A system containing this kind of 
rule need not, in general, terminate because the rule does not strictly reduce an expression. 

One way of getting around this problem is to exclude the concerned rule(s) from 
TW. However, there are two reasons why one may not want to do this. Firstly, the rule might 
be needed to attain the second and the third properties mentioned above. In such a situation 
excluding the rule(s) makes the input unacceptable. The second reason is that omitting the 
rule may leave the specification incomplete.!® The method used by the synthesis procedure 
does not require the specifications to be complete, so the input (excluding the concerned rule) 

_ in this case is acceptable. But the procedure will not be able to derive an implementation that 
is dependent on the property expressed by the rule. 


10. We use the following notion of completeness: A specification is complete if all the properties that 
are valid for the data type are provable from the specification. 
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> 2.3.2. The Class of Implementations Derived 


| There are three factors that are responsible for limiting the class of implementations 
derived by the procedure. The first is related to the subset of the proof theory of the input 
specifications in which the synthesis procedure operates. The procedure can only derive 
those implementations whose correctness proof is within the operational part of the theory. 
The operational part of the theory comprises the subset of the inductive theory that is decided 
by the Musser/Knuth-Bendix method [38] of proving inductive properties. 

The second limiting factor is the termination ordering >. The synthesis procedure 
assumes that an effective ordering is implicitly available to be used in ensuring the 
termination of the implementation. So, the procedure can only derive those implementations 
whose termination can be proved using the ordering >. The more general!! the ordering >, 
the larger is the class of implementations that can be derived. 

The third reason is that the implementations derived may not involve arbitrary 
helping functions. The synthesis procedure is not capable of automatically discovering a 
helping function that might be necessary in an implementation. The user has to furnish a 
specification of the helping function as a part of the Initial World if he wishes an 


implementation in terms of the helping function. 
2.3.3 Effects of Using the Procedure Outside its Scope 


Using the procedure on a specification that does not satisfy the uniform termination 
property may result in infinite looping. This is because, under such a circumstance, there can 
be expressions for which a normal form does not exist. The effect of a violation of the unique 
termination property depends on how serious the violation is. If the violation implies that the 
system is inconsistent, then the procedure may derive an incorrect implementation. However, 
if the system is consistent despite the violation, the effect will only be a reduction in the class 
of implementations that the procedure can derive. It should be noted that all three of the 


1]. An ordering >-, is considered to be more eae [23] than >, if >, contains >». That is, i 
relates a larger set of exDIeIORS than >. 
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propertics required of the inputs can be checked automatically (assuming that a termination 


ordering > is available). 
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#3. Inputs to the Synthesis Procedure 


This chapter has four sections. The first section defines data types and their 
. specification. The second section describes the association specification. The third section 
characterizes the restrictions on the inputs. The last section describes proving properties of 


_ data types from the specifications. 
3 Data Types and their Specification 
3.1.1 Preliminary Concepts 


A data type consists of a set (perhaps infinite) of values, called the value set, and a 
finite set of operations, called the operation set. The only way in which the values of a data 
type can be constructed, manipulated or observed is through the operations of the data type. 

The behavior of a data type is usually dependent on several other data types. These 
data types appear as a part of the domain or as the range of the operations of the data type 
under consideration. We call these other data types the defining types, the data type under 
consideration is referred to as the type of interest (TOI). \f the TOI is the. one that is being 
implemented, we refer to it as the implemented type. The type that is used to represent the 
implemented type is called the representation type. The defining types of the representation 
type are called the ancillary types. The union of the representation type and the ancillary 
types is called the set of implementing types. For example, the defining types of the data type 

‘Queue_Int specified in Fig. 7 are Integer and Bool. 

A data type has two kinds of operations. A constructor is an operation that yields a 
value of the TOI, and an observer is an operation that yields a value of a defining type. For 
Queue_lInt, the operations Nullq, Enqueue, Dequeue, and Append are all constructors; the rest 
of the operations are observers. 

We treat the exceptional behavior of a data type in a simplified fashion. We assume 
that every data type has a unique exceptional value that is constructed by the operation Error 
belonging to the type. The value Error( ) is treated like any other value of the type except 
that it has the following unique property. Every operation is assumed to be strict with respect 
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¢ to Error(): Every operation f is such that when applied to Error() from any of its domain 


» types it yields the exceptional value of the range type of f. We assume that every operation f 


: is a total function: That is, f is defined on every element of its domain yielding either an 


- exceptional value or a normal value from its range type. 


The requirement on a data type that its values be manipulated only by its operations 


translates to requiring that its values be constructed only by its constructors, possibly using 


: the values of its defining types. Furthermore, in a computer the values can be constructed 


. only by a finite sequence of operations, so the value set of a data type is the smallest set closed 


. under finitely many applications of its constructors. This property of a data type is called the 


minimality property {25}. 

A subset of constructors is said to be complete if every value of the TOI can be 
constructed by some composition of the constructors in the subset (possibly using values of 
the defining types). A basis for a data type is a complete set of constructors that is minimal, 
ie., no subset of a basis is complete. A data type may have more than one basis. { Nullq, 


Enqueue } is a basis for Queue_Int since all queues can be generated using Nullq and 


Enqueue, and no subset of it can do so. 


An expression (or a term) is a sequence of operations and variables denoting an 


. application of the operations to the variables.. The type ofan expression is the range type of 


the operation symbol that appears at the outermost level of the expression. A constant is an 


expression that does not contain any variables. For example, Dequeue(Enqueue(q, e)) is an 
expression of type Queue_Int; it is not a constant since it contains variables. 


Dequeue(Enqueue(Nullg, 0)) is a constant of type Queue_Int. 
3.1.2 Definition of a Data Type 


The only way in which the values of a data type can be manipulated is through the 
operations of the type. We define a data type so as to capture the behavior of the type as 
viewed through the operations of the type. This behavior is called the observable behavior of 
the data type. This method of definition was advocated by Guttag [16], and later developed 
by Kapur [25]. According to this view, the values of a data type are distinguishable only by 
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: means of the operations of the type. : 

: Heterogeneous algebras provide a natural means of modeling the behavior of a data 

type. A heterogeneous algebra that can be used to model a data type is defined recursively in 

~ terms of the algebra that is used to model each of its defining types. The basis of this 

_ recursion is the type Bool which does not have any defining types. 

| A heterogeneous algebra for a data type D, consists of (i) a domain corresponding to 

_D, which is called the principal domain, (ii) a domain corresponding to every defining type of 
D, Gii) a function corresponding to every operation of D. The elements of the principal 
domain are used to denote the values of D. The minimality property of a data type requires 
that every element of the domains of the algebra be constructible by a finite number of 
applications of the constructors of the appropriate type. Any heterogeneous algebra that has 
the appropriate signature, and that exhibits the desired observable behavior can be used to 
model the data type. Hence, we define a data type as a set of heterogeneous algebras that 
exhibit the same observable behavior. Every algebra in the set is said to be a model of the 
data type. The elements of the principal domain are called the values (of D) in that model. 
Below we formally characterize the observable behavior of a heterogeneous algebra. 

The observable behavior of a model is characterized in terms of the 
distinguishability relation on the values of the model. : The distinguishability relation is _ 
defined inductively in terms of the distinguishability of the values of the defining types. That 
is, we assume that the distinguishability relation is already defined the domain corresponding 
to each of the defining types. (The basis of this induction is the data type Bool that does not 
have any defining types; the only two values, True and False of Bool are assumed to be 
distinguishable.) Two values of a model are distinguishable if and only if there is a sequence 
of operations of D with an observer as the outermost operation, that produces distinguishable 
results when applied separately on the values. If two values are not distinguishable, they are 
observably equivalent. For instance, the Queue_Int values constructed by. Enqueue(Nullq, 9) 
and Append(Nullq, Enqueue(N ullg, 0)) are observably equivalent; but the ones constructed by 
Enqueue(Nullq, 0) and Dequeue(Enqueue(Nullg, 0)) are distinguishable. Observable 


equivalence is an equivalence relation. 


’ Definition Two models are behaviorally equivalent if their quotient models induced by the 


observable equivalence relations are isomorphic to each other. 
: Definition A data type is a set of behaviorally equivalent heterogeneous algebras. 


3.1.3 Specification of a Data Type 


The specification of a data type is a piece of text in a formal language. It describes a 
set of properties concerning the operations of the data type. The aim of writing a 
specification is to characterize through the specification the observable equivalence relation 
that defines the data type. 

It has been observed [17] that the construction of an algebraic specification for a 
data type is rendered easier and more reliable (in the sense that one has increased confidence 
in the consistency and completeness of the specification) by using a basis of the data type as a 
guide for constructing the specification. We assume that all our specifications are constructed 
in this fashion. The operations belonging to the basis of a specification are called the 
generators of the specification. An operation that is not in the basis is called a non-generator. 
Note that all generators are constructors; non-generators may be constructors or observers. 

Throughout the development when we refer to the basis or the generators of a data 
type involved in the synthesis, we actually mean the basis or the generators associated with 
the specification of the data type being used as an input to the synthesis procedure. 
Definition of a couple of new terms pertaining to the generators are in order at this point. A 
generator expression (generator constant) of a data type is.an expression (constant) that 
consists of only the generators of the type. Taking Queue_Int with the specification given in 
Fig.7. as an example: Enmqueue(Nullq,0) is a generator constant whereas, 


Dequeue(Enqueue(Nullq, 0)) is not a generator constant, because Dequeue is a non-generator. 


eh Ie 


+ 3.1.3.1 The Specification Language 


The specification language we use is a restricted version of an equational language 
- that permits conditionals and auxiliary functions. The language is similar to the ones used in 
several other works on data type specification and verification such as [{14, 18, 25). A 
specification has two parts: the Operations part describes the functionality of every operation 
| of the TOI; we assume that the Operations part identifies the basis used for the specification. 
; The Axioms part consists of a set of axioms déscribing the properties of the operations, Every 
axiom has the form of an equation e, = e,, where e, and e, are expressions of the same type. 
_ The expressions may involve any of the operations of the TOI and the defining types. . The 
expressions may contain any of a finite number of auxiliary functions which are also specified 
as part of the specification. The equations may involve conditional expressions on their right 
hand side, ie., €, may contain the auxiliary function if_then_else which behaves like a 
conditional expression, !2 For the sake of clarity, we use the following more conventional 
syntax for an expression involving if_then_else. The expression if_then_else(b, >) e,,) is 
written as if b then e,, else e,.. 

We differ from the works cited above by assuming that every axiom in the 
specification satisfies the following syntactic constraints. The constraints are not restrictive, in 
the sense that they do not restrict the class of data types that can be specified. The first 
constraint enables us to automatically partition the axiom set into two disjoint sets: One that 
contains only the generator symbols; the other whose axioms may involve generators as well 
as nongenerators. The partitioning of the axiom set facilitates the synthesis process by 
reducing the inter-dependence of the synthesis of different operations. The second constraint 
permits the axioms to be treated as left to right rewrite rules (to be described later) without 


having to interchange the two sides of the axioms. 


12. if_then_else can be specified by the following two equations. 
if_then_else : Bool X T XT -> T 
if_then_else(True, ee) =e, | 
if_then_else(False, e,, €,) = e, 


- 42 - 


‘ Every axiom e, = e, of a specification satisfies the following conditions: | 
(1) Every data type specification explicitly identifies a basis, i.e., a set ‘of generators. 


(2) The set of variables in e, is a subset of the set of variables in e.. 


2 


Figures 7 and 8 show specifications of a (FIFO) queue of integers (Queue_Int) and a circular 


- list of integers (Cire_List). The specifications meet the constraints specified above. 
3.1.3.2. Semantics of a Specification 


The specification of a data type characterizes the observable equivalence relation 
that defines the data type. The semantics of a specification is a set of heterogeneous algebras 
that are behaviorally equivalent based on the observable equivalence relation characterized 
by the specification. 

To determine the observable equivalence relation characterized by a specification, 
| the symbol ’=’ in the axioms of the specification should be read as ’observably equivalent’. 
For instance, the equation Sizc(Enqueue(q, e)) = Size(q) +1 in the specification of 
Queue_Int asserts that the two expressions yield observably equivalent values for all 
instantiations of the variables in them. The observable equivalence relation characterized by 
the specification is the reflexive, symmetric, transitive closure of =. Every algebra that 
satisfies all the axioms in the specification is a model of the type being specified by 


specification. 
3.2 Association Specification 


In addition to the specifications of the types involved in the synthesis, the synthesis 
procedure expects the user to provide information about the representation scheme to be 
used by the implementation that is to be derived. This section explains what exactly that 
information is, and how it can be specified. We call the formal description of the information 


the association specification of an implementation. 
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_ Fig. 7. Specification of Queue_Int 
Queue_Int is Nullq, Enqueue, Front, Dequeue, Append, Size 


Defining Types 
Bool, Int 


Operations 


Nullq > Queue_Int 

’ Enqueue : Qucuc_Int X Int -> Queue_Int 

Front  : Queuc_Int -> Int U { ERROR } 
Dequeue : Queuc_Int -> Queue_Int U { ERROR } 
Append : Queue_Int X Queue_Int -> Queue_Int 


Size : Queuc_Int -> Int 
Basis 

{ Nullg, Enqueue } 

Axioms 


(1) Front(Nullq) = ERROR 
(2) Front(Enqueue(Nullg, e)) = e 
(3) Front(Enqucuc(Enqueue(q, 1), ¢2)) = Front(Enqueue(q, c1)) 


(4) Dequeue(Nullq) = ERROR 
(5) Dequeue(Enqueuc(Nullq, e)) = Nullg 
(6) Dequeue(Enqueue(Enqueue(q, 1), e2)) = Enqueue(Dequeue{Enqueue(q, e1)), e2) 


(10) Append(q, Nullq) = q 
(11) Append(q1, Enqueue(q2, e2)) = Enqueue(Append(ql, q2), e2) 


(12) Size(Nullq) = 0 
(13) Size({Enqueue(q, e)) = Size(q) + 1 


Fig. 8. Specification of Circ_List 
Circ_List is Create, Insert, Value, Remove, Rotate, Empty, Join 


Defining Types 
Integer, Boolean 


Operations 


i Create :-> Circ_List 

Insert : Circ_List X Integer -> Cire_List 
Value —: Circ_List -> Integer U { ERROR } 
Remove: Circ_List -> Circ_List U { ERROR } 
"Rotate — : Circ_List -> Cire_List 
Empty =: Circ_List -> Boolean 
Join : Circ_list X Circ_list -> Circ_list 


~ Comment 

' Circ_List is a list of integers with a front end and arear end. Create constructs an empty list; the front 

. and the rear ends of an empty list are the same. Insert inserts an element into a list at the rear end. 
Value returns the element at the rear end of a list. Remove removes the element at the rear end from a 
list. Rotate moves every element in a list by one position towards the rear end in a cyclic fashion, i.e., 
the element at the rear is moved to the front. Empty checks if a list is empty. Join joins two lists by 
positioning the first argument in front of the second. 


Basis 
{Create, Insert} 


Axioms 


(1) Value(Create) = ERROR 
(2) Valuc(Insert(c, i)) = i 


(3) Remove(Create) = LRROR 
(4) Remoye(Insert(c, i)) = ¢ 


(5) Rotate(Create) = Create 
(6) Rotate(Insert(Create, i)) = Insert(Create, i) 
(7) Rotate(Insert(Insert(c, i1), i2))) = Insert(Rotate(Insert(c, i2)), il) 


(8) Empty(Create) = true 
(9) Empty(Insert(c, i)) = false 


(10) Join(c, Create) = c 
(11) Join(c, Insert(d, i)) = Insert(Join(c, d), i) 
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13.2.1 What is an Association Specification ? 


An association specification characterizes two pieces of information about a 


‘Tepresentation scheme: 


(1) The set of values of the representation type that an implementation may use in 
representing the values of the implemented type. We call this set the representing 
domain (%). % is characterized by means of a predicate on the representation type 
called the invariant(3): % is the set of values of the representation type for which § 


is True. 


Q)A function, called the abstraction function, from the values of the representation type 
to the values of the implemented type. The function corresponds to the 
representation function of a data type introduced by [21]. The abstraction function 
maps a representation value to an abstract value that the former may represent in an 
implementation. An abstraction function may be a many-to-one function. An 
abstraction does not have to be defined on every value of the representation type. - 


However, it has to be defined on every value in the representing domain. 


The information characterized by the association’ specification is often the most 
creative part of an implementation. The proof of correctness of an implementation also, in 
general, needs to use information such as this. If the invariant part of an association 
specification is vacuous, then we assume that the invariant is true on all values of the 
representation type. In such a case the representing domain includes all the values of the 


representation type. 
3.2.2 How Is It Expressed ? 


We specify J and A using the same language that is used to specify the data types 
involved. J is specified as a set of equations, like any other predicate on the value set of the 
representation type. A is specified as a set of equations relating expressions of the 
representation type to expressions of the implemented type. We require that A be specified 
as a well-defined function with a nonempty domain. - 
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, Fig. 9. Two Association Specifications for Queue_Iint 


9a) Queue_Int in terms of Circ_List 


- A(Create) = Nullq 
A(Insert(c, i)) = add_at_head(A(c), i) 


add_at_head(Nullq) = Enqueue(Nulla, i) 
add_at_head(Enqueue(q, i), i1) = Enqueue(add_at_head(q, il), i) 


%b) Queue_Int in terms of Array_Int X Int X Int 


A{&y, i, D} = Nullq 
A(<Assign(y, e, j), i, j+ 1>) = if i = j+1 then Null 
else Enqueuc(.A(<v, i, j>), e) 


5(<Xv, i, D) = True 

5(<Assign(y, e, j), i, j+ 1>) = ifi = j+1 then True 
else if j4+-1< i then False 
else J(<y, i, j>) 


Fig. 9 gives a couple of example of an association specification. 9a) specifies an 
implementation of Queue_Int in terms of Cire_List. The empty queue is represented by the 
empty list; a nonempty queue is represented by a list whose elements are identical to the ones 
in the queue, but are arranged in the reverse order. The motivation for this representation 


scheme is that reading and deletion of elements from a queue can be performed efficiently. 


Consider the association specification given in Fig. 6. It specifies a representation 
scheme for implementing Queue_Int as a triple , which can informally be described as 
follows. (Array_Int is specified in the next chapter which also describes the association 
specification shown below in more detail.) | 


Fig. 9(b) specifies an implementation in which a queue is implemented as a triple 
Array_Int X Integer X Integer. (Array_Int is specified in Fig. 10.) The representation scheme 
can be informally described as follows. Nullq can be represented by any triple in which the © 
two integer components are equal. A nonempty queue can be represented by a triple <y, i, D, 
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; Fig. 10. Specification of Array_Int . 


Array_Int is Nullarr, Assign, Read, Size, Empty 


: Defining Types 


" Integer, Boolean 


_ Operations 


~ Nullarr :-> Array_Int 


Assign —: Array_Int X Integer X Integer -> Array_Int 
Read : Array_Int X Integer -> Integer U { ERROR } 
Size : Array_Int -> Integer 


. Empty —: Array_Int -> Boolean 


’ Comment 


Array_Int is an array of integers. Every element in the array is indexed by an integer; the indices are 
not necessarily contiguous. Nullarr creates an empty array. Assign assigns a given value (the second 
argument) to the element at a given index (the third argument); if the array does not have an element 
with the given index, then the valuc is added to the array. Read reads the clement at the given index. 
Empty checks if an array is empty. 


Basis 

{Nullarr, Assign} 

Axioms 

(1) Assign(Assign(y, e1, il), e2, i2) = if il = i2 then Assign(y, e2, i2) 
else Assign(Assign(v, e2, i2), el, il) 


(2) Read(Nullarr, i) = ERROR 
(3) Read(Assign(v, e, j), i) = ifi = jthene 
else Read(y, i) 


(4) Empty(Nullarr) = true 
(5) Empty(Assign(y, e, i)) = false 
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-where y is an array of arbitrary Jength containing the elements of the queue between the 
index values i and j-1, in order. In other words, i points to the front end of the queue, and j 
points to the next position available in the queue for adding an element. | 

Note that in this example, unlike the last one, not every value of the representation 
type can legally represent a queue. A triple <v, i, j> is a legal representation value if only if 
_{ <j, and v is guaranteed to be defined on all index values between i and j-l. The invariant 3 
in specifies this condition. 

The abstraction furiction A is specified so that it is defined on all values for which 5 
is True. The specification uses an auxiliary function Add_at_head. Add_at_head is a function 
on Queue_Int that adds a given element at the front of a queue. A specification of 


Add_at_head is given as a part of the association specification. 
3.2.3 Further Discussion on Association Specification 


It is important to note that every association specification need not have an 
implementation corresponding to it. To understand this more clearly, let us look at the 
relationship between an association specification and an implementation that uses a 
representation scheme consistent with the one characterized by the association specification. 


An implementation of a data type consists of 
(i) a representation type being used as the representation for the implementation. 


(ii) a program, i.e., a segment of code, for every operation of the type in a language; this 
program is called the implementation of the corresponding operation. 


Note that both a preliminary implementation and a target implementation (as introduced in 
the previous chapter) of a data type are implementations of the data type. A preliminary 
implementation uses one language to express the program, while the target implementation 
uses another. | 

Formally, an implementation of a data type can be considered to be denoting a 
heterogeneous algebra, called an implementation algebra, with 
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(i) a principal domain that is a subset of the value set of the representation type, 


(ii) a domain corresponding to every defining type of the implemented type - this 


domain is identical to the value set of the corresponding defining type, 


(iii) a function corresponding to the implementation of every operation of the 
implemented type so that the function mimics the behavior of the implementing . 


program, 


An implementation of a type is correct if there exists a homomorphism, from the 
implementation algebra to to the implemented type. The association specification should be 
such that there exists an implementation algebra with computable functions that corresponds 
to the representation scheme characterized by the association specification. More specifically, 


the implementation algebra should satisfy the following conditions: 


(i) The principal domain of the algebra is the representing domain characterized by the 


association specification. 


(ii) There is a computable function in the algebra with the appropriate functionality 
corresponding to every operation of the implemented type. | 


(iii) The implemented data type is a homomorphic image of the implementation algebra 


with respect to the abstraction function. 


We do not intend to formally characterize the properties that the association specification 
ought to satisfy so that it meets the above requirement. Rather, we trust the intuition of the 
user, and assume that there exists an implementation that is consistent with the association 
specification furnished by him. If the association specification provided as an input to the 
synthesis procedure is such that there is no implementation corresponding to it, then the 
synthesis procedure will, in general, never terminate. The synthesis method, however, does 


not produce an incorrect implementation in such a case. 


¥ 


i 
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3.3 Restrictions onthe Inputs - 1 


The method used by the synthesis procedure to derive an implementation is based 


_ on treating every equation in the specifications as a rewrite rule. The procedure combines all 


the input specifications, and treats the union as a set of rewrite rules called the Jnitial World. 


The restrictions imposed on the inputs are intended to ensure that the Initial World satisfies a 


_ useful property called the principle of definition. 


The first subsection informally introduces the basic concepts about rewrite rules. 


(See Appendix! for formal definitions.) The second subsection defines principle of 


definition, and develops a sufficient set of conditions for principle of definition (SCPD). The 
input is expected to satisfy SCPD. The third subsection describes how to prove properties 
from a specification that satisfies SCPD. 


3.3.1 Rewrite Rules and Rewriting Systems 


A rewrite rule is an ordered pair (left, right), written left — right, where left and 
right are expressions containing variables so that the variables in right are among the 
variables in left. A rule is used to reduce an expression by replacing any subexpression that is 
matched by left with a corresponding version of right, i.e, with the same substitutions for 
variables that were made in matching left. (More precise definitions are given in Appendix I.) 

For example, consider the rule 
Append(q,, Enqueue(q,, i,)) +> Enqueue(Append(q,, q,), i,), and the expression 
a = Dequeue(Append(q,, Enqueue(Nullq, 0))). @ is reducible using the rule because it has a 
subexpression a' = Append(q,, Enqueue(Nullq, 0)) that has the form of the left hand side of 
the mule: ‘That is,  Append(q,, Enqueue(q,, i,)) becomes identical to 
Append(q,, Enqueue(Nullq, 0)) when the variables in the former are substituted according to 
the substitution o = [q,++ 4,4, ++ Nullg,i, ++ 0]. The corresponding instance of the right 
hand side of the rule (obtained by substituting the variables in Enqueue(A ppend(q,, q,), i) 
using the substitution o) is B' = Enqueue(Append(q,, Nullq), 0). 
p= Dequeue(Enqueue(Append(q,, Nullq), 0)) is the expression obtained by replacing a' by 


B' in a. Then, we say that a reduces to B, written a — B. 
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A rewriting system is a set of rewrite rules. Let R be a rewriting system. An 
expression a is reducible by R if it is reducible by some rule in R. If @ is not reducible by any 
rule in R, then a is irreducible by R. © 

If a — B by arule in R; then we say that a directly reduces to B using R, and once 
again write it as a — B (using R). Let -+* be the smallest relation on pairs of expressions 
which is the reflexive, transitive closure of +. Thus, a —* £8 if and only if there exist 
a,—a,,, fori = 0,...,n-1 and 


expressions a,,a,,---,a,, where n > 0, such that a = a 


p ee 
a, = 8. We read a -* B asa reduces to B. 


0’ 


Suppose a —* £, and £ is irreducible. Then we say that a simplifies to B; B is called 
a normal form of'a (in R). 

Rewriting systems are used to simplify expressions into their normal forms. Thus, a 
useful property of a system is uniform termination: R has the uniform termination property if 


no infinite sequence of reductions, a, + a, — ..., is possible in R. When R has the uniform 


0 
termination property every expression is guaranteed to have a normal form. Another useful 
property of a rewriting system is unique termination: R has the unique termination property if 
any two terminating sequences of reductions starting from the same expression have identical 
final expressions. When R has the unique termination property the normal form (if it exists) 
of every expression is unique. A rewriting system that has both the uniform termination 
property and the unique termination property is said to be convergent. When R is convergent 
every expression a has exactly one normal form; we denote the unique normal form of a ina 
convergent system by as. 

The rewriting systems corresponding to our input specifications are obtained by 
simply replacing the symbol ’=’ by the symbol ’—’ in each of the equations in the 
specifications. For example, Fig. 11 gives the rewriting system corresponding to the 
specification of Queue_Int in Fig. 7. Henceforth, we treat the input specifications as rewriting 
systems obtained as explained above. When we refer to a specification, we actually mean the 
rewriting system obtained from the specification. 
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+ Fig, 11. The Queue_Int Rewriting System 


~ (1) Front(Nullq) - ERROR 
. (2) Front(Enqueue(Nullg, e)) — e 
(3) Front(Enqueue(Enqueue(q, el), e2)) — Front(Enqueue(q, e1)) 


(4) Dequeue(Nullq) + ERROR 
(5) Dequeue(Enqueue(Nullg, c)) + Nullq 
’ (6) Dequeue(Enqueue(Enqueuce(q, e1), e2)) + Enqueue(Dequeue(Enqueue(gq, ¢1)), e2) 


(10) Append(q, Nullq) > q 
(11) Append(ql, Enqueue(q2, e2)) ++ Enqueue(Append(q1, q2), e2)° 


(12) Size(Nullq) — 0 
(13) Size(Enqucue(q, e)) — Size(q) + 1 


3.3.2 The Principle of Definition 


The principle of definition is a property of a specification (or a group of 
specifications). The property ensures the consistency of a specification. The property 
reinforces the two-tier characteristic inherent in our specifications: It ensures that the 
generators are specified among themselves, and the nongenerators are specified as total 
functions in terms of the generators. Finally, the property is useful in mechanically proving 
properties of data types from their specifications. The property is similar to a property with 


the same name defined in [22]. Our definition is more general than the one in [22]. 


Definition The Principle of Definition 


A specification (or a group of specifications) S has the principle of definition property if every 
constant t has exactly one normal form (in S), and the normal form is a generator constant of 


the appropriate type. 


There will be situations in our development when it is necessary to use a restricted 
version of the principle of definition. The notion is restricted in the sense that the principle 
of definition need hold good only for a subset of terms. The restricted property is useful in 
stating that every nongenerator defined by a system be defined as a total function on a subset 
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¢ of the value set of a type. We give a definition the property below. 


. Definition Principle of Definition With Respect T 


Let T be a set of generator constants not necessarily including all possible constants. A 
. system S satisfies the principle of definition with respect to T if the following condition holds: 

Every constant of the form F(g,,...,g,), where F is a nongenerator function symbol and . 
. £,--+»8, are generator constants in T, has a unique normal form (in S) that is a generator 


" constant in T. 


The principle of definition has two parts to it: It requires every constant to have a 

unique normal form in S, and the normal form to be a generator constant. SCPD has to be 

- formulated so as to ensure the two parts. The first part can be ensured by requiring S to be 
convergent (i.e., to satisfy the uniform termination property and the unique termination 
property), The second part is ensured by requiring S to be well-spanned. We define what it 
means for S to be well-spanned below, and then show how the two properties ensure the 
principle of definition of S. | . 

Consider the rewriting system shown in Fig. 11. The system has three rules (1, 2, 
and 3) in which the expression on the left hand side has Front as its outermost symbol. The 
set, {Nullgq, Enqueue(Nullq, e), Enqueue(Enqueue(q, e1), e2)}, of generator expressions that 
appear as arguments to Front on the left hand side in the rules spans the entire set of 
generator constants of Queue_Int; in other words, every generator constant of type 
Queue_Int is an instance of one of the expressions in the above set. When a rewriting system 
has enough rules corresponding to a nongenerator function f so that the set of generator 
expressions appearing as arguments to f spans the set of all generator constants, we say that f 
is well-spanned by the rewriting system. We say that a rewriting system is well-spanned if 
every nongenerator function symbol of the system is well-spanned. We formalize this notion 
below. 

In general, since f can be multi-ary, the arguments to f are k-tuples of expressions of 
appropriate types, where k is the arity of f. In the following formalization, we first define the 


notion of a set of k-tuple of generator expressions being well-spanned, informally, a set of 
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¢ k-tuples of generator expressions is well-spanned if it spans the set of all k-tuples of generator 
- constants of appropriate types. The property of a function being well-spanned is defined in 
> terms of the notion of a well-spanned set of k-tuple of generator expressions. In the 
! following, we assume that the k-tuples are homogeneous with regard to the types of their 


. components. The extension to the heterogeneous case is simple. 


- Definition A set A = {A,,-++, A,} of k-tuples of generator expressions A, = <e,,,..., @,> is 
well-spanned if the following condition holds: For every k-tuple, <t,, ete st, of generator 
constants there exist n, 1 < n < p, and a substitution o, such that for every j, 1<j<k, we 


have t, = o(e,,). : 


Definition A nongenerator function f is well-spanned by a rewriting system R if there is in R a 


set of rewrite rules whose left hand sides are of the form fie F e.), 1 <i <p, and the set 


wees 
{<€,,--.,€ >] 1 <i <p} is complete. : 


Definition A rewriting system R is well-spanned if every nongenerator function symbol in R is 


well-spanned. 


Definition A specification. S satisfies the sufficient condition for the principle of definition 
(SCPD) if S satisfies the following conditions: 


(i) Sis convergent 


(ii) Sis well-spanned. 


Lemma If S satisfies SCPD then S satisfies the principle of definition. 


Proof Condition (i) guarantees that every constant has exactly one normal form. Condition 
(ii) implies that en constant of the form f(g,,... > 8)» where f is a nongenerator and 
Z)--->8, are generator constants is reducible. Since S satisfies uniform termination 
property, this means that no constant with a nongenerator can be a normal form. Hence the 
normal form of every constant is a generator constant. 

QED 
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i: 3.3.3. Checking the Principle of Definition 


t 


. The main reason for formulating SCPD is so that we might be able to develop 
effective methods of checking if a specification satisfies the principle of definition. This 
" section sheds some light on this ik 
To check if a specification is well-spanned, we have to check if the set of expressions 
| (or k-tuples of expressions) that appear as arguments to each of the implementing functions is 
complete, Huet in [22] has demonstrated that it is possible to come up with an effective set of 
conditions that is sufficient to check if a set of expressions is complete. 
Checking the convergence of a set of rules, which forms the remaining condition of 
SCPD, has been investigated in [28, 22]. The result in the cited works, which is due to Knuth 
and Bendix, provides an algorithm (henceforth referred to as the KB-algorithm) to check the 
convergence of a finite set of rewrite rules that satisfies the uniform termination property. 
Thus, if we can independently ensure the uniform termination property of a specification, 
~ then we can use the KB-algorithm to show the unique termination property of the 


specification. 
3.3.3.1 Checking Unique Termination 


Let R be a finite set of rewrite rules that has the uniform termination property. The 
following theorem is the basis for the KB-algorithm for checking the unique termination 
property. The theorem depends upon the concept of unification of expressions. We will first 
define this concept. 

Two expressions a and 8 with disjoint variable sets are said to be unifiable if there 
exists a substitution @ such that 6(a) = a(g).43 The most general unifier of two unifiable 
expressions a and 8 is the unifier @, such that for any unifier o of a and £ there exists a 
substitution p such that o is the composition of p and @. The unification algorithm of 


Robinson [44] can be used to determine a most general unifier of two given expressions or 


13. The symbol = stands for two expressions being identically equal. 
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+ decide that they are not unifiable. In the discussion to follow we assume that the candidates 
. for unification have variables renamed if necessary to obtain disjoint variable sets. 
| Let y, + 5, and y, - 6, be two rules of R so that y, is unifiable:with a nonvariable 
subexpression of y,. More precisely, there exists an occurrence u in y, such thata = y,/uis 
: not a variable, and a is unifiable with y,. Let 6 be the most general unifier of a and y,. 
| Then, we say that @(y,) is a superposition of y, on y,, (If B is either a superposition of y, on y, 
or a superposition of y, on y,, then we say that 8 is a superposition between y, and y,.) To 
"each superposition there corresponds a critical pair <a,, a> of expressions defined as follows. 


"a, and a, are the expressions obtained by applying to 6(y,) the above two rules, respectively. 


More precisely, 
a, = Hy,u~ 0(5,)] 
a, = 6(6,) 


For example, consider the following rules 

Append(ql1, Enqueue(q2, i2)) — Enqueue(Append(ql, q2), i2) 

Append(Append(q3, 94), q5)) + Append(q3, Append(q4, q5)) 

y, is unifiable with the entire expression 7, by the most general unifier @ = [Append(q3, q4) 
for ql, Enqueue(q2, i2) for q5], yielding the superposition « and the critical pair <a,, a,> 
shown below: . 

a = Append(Append(q3, 44), Enqueue(q2, i2)) 
“a, = Enqueue(Append(Append(q3, 44), q2), i2) 

a, = Append(q3, Append(q4, Enqueue(q2, i2))) 


Theorem 1 The KB-Theorem 


If R has the finite termination property, then it has the unique termination property if and 
only if every critical pair <a,, a,> of R has the property that a, and a, have identical normal 


form. 
Proof For a proof see [28, 22]. 


If a finite rewriting system has nc: superpositions, and therefore, no critical pairs, it is said to 
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+ be superposition-free. Thus, we trivially have: 


Corollary If a finite rewriting system has the uniform termination property, and is 


. superposition-free, then it has the unique termination property. 


‘For example, the rewriting system in Fig.11 corresponding to Queue_Int is 
superposition-free. In the next subsection we show that it satisfies the uniform termination 


property. So the rewriting system is convergent. 
' 3.3.3.2. Checking Finite Termination 


A general technique for checking termination of a rewriting system R is to 
demonstrate that it is possible to define a well-founded partial ordering > on the set of all 
constants (that can be constructed using the function symbols in R) so that t, — t, implies 
t, >t, A partial ordering is well-founded if there are no infinite descending sequences such 
as t, > t, >... for any constants. Hence, there cannot be any infinite sequence of rewrites 
using R also. Appendix II goes into this topic in greater detail. It describes a theorem that 
provides a useful guideline to define a suitable partial ordering to check the uniform 
termination property of a rewriting system. a 

We assume that a well-founded partial ordering > on expressions is available as an 
input to the synthesis procedure. The ordering > is used by the synthesis procedure not only 
to ensure the uniform termination property of inputs, but also to ensure that the output 
synthesized terminates. The orderings used in our examples belong to a class of orderings, 
called the /exicographic recursive path ordering [26, 10}. A formal definition of the ordering is 

_given in Appendix II. | 


3.4 Proving Properties of a Data Type 


The properties of a data type we are interested in are always expressed as equations 
of the form e, =€,, where e, and e, are expressions, and = denotes the observable 
equivalence _ relation (see sec. 3.1.2). For instance, the property 
Append(Append(q,, q,), 4,) =. Append(q,, Append(q,, q,)) asserts that for every instantiation of 


- 58 - 


‘the variables by values the expressions on the two sides of the equation yield observably 
~ equivalent values. Our objective is to prove a property as a theorem from a specification of 
. the type. This is crucial to our work because synthesis of implementations involves searching 
‘for appropriate theorems of the input specifications. In the following, we describe how to 


mechanically prove theorems from a specification that satisfies the principle of definition. 


Definition A Theorem of a Specification 
. Let S be a specification (or a group of specifications). Let o be a substitution that maps 
variables to generator constants. An equation e, =e, is a theorem of S if for every o the 


constants o(e,) and o(e,) have identical normal forms. 


Note that the above definition of a theorem gurantees that if e, = e, is a theorem of S thene, 
and e, always yield observably equivalent values. This is because the principle of definition 
ensures that for every instantiation of the variables (in e, and e,) by generator. constants the 
two expressions simplify to the same. generator constant. This provides a basis for developing 
a method for mechanically proving properties of data types from specifications, 

Note that the reverse of the above implication is not true. This is because we — 
require that the input specifications be only consistent (via the principle of definition), but 
not complete [25]. A specification S of a data type D is complete if every equation e, = e 
such that e, and e, are observably equivalent for D is a theorem of S. The synthesis 
procedure would be more productive if the input specifications are complete. This is because 
it is possible to prove more properties from a complete specification, and hence the synthesis 
procedure might be able to derive a larger class of implementations. 

There are several ways in which the above result can be used to deduce that an 
equation is a theorem of a specification. The methods differ in the reasoning or logic used for 
the deduction. In our development we deal with two kinds of logic: the equational logic, and 


the inductive logic. 


Equational Logic 


In the equational logic e, = e, is deduced to be a theorem of S by checking if e, and 
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: e, have the same normal form in S.: Note that ife,+ = e,+, then it is obvious that e, and e, 
have identical normal forms for every substitution of the variables by generator constants. (e+ 
denotes the normal form of e.) An equation deduced to be a theorem ofS in this fashion is 
said to be a theorem in the equational theory of S. When S satisfies the principle of 
' definition every expression is guaranteed to have a unique normal form. Therefore, it is 
possible to develop a general procedure to decide the entire equational theory of S. As an 
~ illustration, we give a proof of 
Append(Append(q,, q,), Nullq) = Append(q,, Append(q,, Nullq)) using the specification of 
* Queue_Int shown in Fig. 11. 


Equation to be proved: Append(Append(q,, q,), Nullq) = Append(q,, Append(q,, Nullg)) 


Normal form of Ieft hand side: Normal form of right hand side: 
Append(Append(q,, q,), Nullq) Append(q,, Append(q,, Nullq)) 
Rule (10) \ . | Rule(10) 
Append(q,, q,) 3 Append(q,, q,) 
Inductive Logic 


A property © is deduced to be a theorem in the inductive logic by using, besides the 
reduces relation —*, some form of mathematical: induction. A property that is deduced 
using the inductive logic is called a theorem in the inductive logic. The set of all properties 
that can be deduced from a specification using the inductive logic is called the inductive 
theory of the specification. 

The induction used is carried over the set of all generator constants using one or 
more of the variables in ® as parameters for the induction. The induction is based on any 
well-founded partial ordering on the set of generator constants. Suppose G is the set of all 
generator constants, and > is a well-founded partial ordering on G. Suppose we are using 
the variable v of @{v) as the parameter of induction. Then the induction rule may be stated as 


follows: 


Induction rule 


eae 


* If for every t € G we can show that, for every t' € Gsuch that t — olv/t'] => $[v/t], then 


' &(¥) is theorem. 


To apply the induction tule, we have to define a partial ordering > on G. Since G | 
_ can, in general, be infinite the definition of > is usually recursive. The step of showing 
olv/t'] => @[v/t], for every t > t', is fragmented into several cases, Each of these cases is 
. established using the relation -+* as was done in the equational logic. Fig. 12 gives an 
example of an inductive proof. It —_— proves the property 
Append(Append(q,, q,), q,) = Append(q,, Append(q,, q,)) from the specification of Queue_Int 
given in Fig.11. The proof uses an ordering generated by the following relation on the 
generator expressions of Queuc_Int: Enqueue(q, i)> Nullq, and Enqueue(q, i) > q. The 
proof uses the variable q, as the parameter of induction. | 


It is not possible to develop a general procedure to decide the entire inductive 


Fig. 12. Proof by Inductive Logic 

Theorem to be proved: Append(Append(q,, q,), 4,) = Append(q,, Append(q,, q,)) 
Basis: q, ++ Nullq 

To prove: Append(Append(q,, q,). Nullg) = Append(q,, Append(q,, Nullq)) 
Proof is demonstrated above. 

Induction: q, ++ Enqueue(q, si) 


Hypothesis: Append(Append(q,, q,), q) + Append(q,, Append(q,, q)) 
To prove: Append(Append(q,, 4,), Enqueuefq, i)) = Append(q,, Append(q,, Enqueue(a, i))) 


Normal form of left hand side: Normal form of right hand side: 
Append(Append(q,, q,), Enqueue(q, i)) Append(q,, Append(q,, Enqueue(aq, i))) 
Rule(1i1) | | Rule(11) 
Enqueue(A ppend(Append(q,, q,), @), i) Append(q,, Enqueue(Append(q,, q), i) 


Hyp. { ; | | Rule(11) 


Enqueue(A ppend(q,, Append(q,, q)), i) Enqueue({Append(q,, Append(q,, q)), i) 
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3 theory of S. This is because the inductive hypotheses necessary for the proof cannot be 


generated automatically in all situations. However, when S satisfies the principle of 
definition a significant number of interesting properties in the inductive theory can be proved 


automatically. The automatic method, first developed by Musser [38, 22], is based on the 


’ Knuth-Bendix algorithm (see sec 3.3.3.1) for checking convergence of a rewriting system. We 


‘ use this method for synthesizing implementations whose proofs of correctness need 


induction. We will explain the method -in chapter 4 while describing synthesis in the 


- inductive theory. 
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‘4, Stage 1: The Preliminary Implementation 


, This chapter discusses the preliminary implementation of a data type, and develops 
, a method to derive it from the inputs to the synthesis procedure. A distinguishing 
_ characteristic of the method outlined is that it is based on a method for proving the 
, correctness of a preliminary implementation. The chapter is organized into the following 
. sections. The first section defines precisely what constitutes a preliminary implementation. 
The second section gives a mathematical formulation of the problem involved in the 
derivation of a preliminary implementation for a data type from the given inputs. For 
convenience, the problem is formulated, and solved here for a situation where the 
representing domain is identical to the representation value set. In the next chapter, we 
extend the derivation problem to the more general situation where the representing domain is 
a subset of the representation value set. The last section describes a procedure to derive the 


preliminary implementation from the input specifications. 
4.1 A Preliminary Implementation 


A preliminary implementation of a data type is an implementation for the 
implemented type in a rewrite rule language. The preliminary implementation uses a 
representation scheme that is consistent with the one characterized by the association 
specification supplied by the user. It consists of two parts: The Representation part, and the 
Definitions part. 

The Representation part gives the representation type used for the implementation 
of the implemented type. We call the values of the representation type the representation 
values, and the set of representation values the representation value set. Only a subset of the 
representation value set need be used to represent the values of the implemented type. This 
subset is called the representing domain, and is characterized by the association specification. 

The Definitions part contains definitions for a set of new functions on the 
representation values. We call the new functions the implementing functions. There is an 
implementing function corresponding to every operation of the imolemented type; the 
former implements the latter. The definition of an implementing function that implements 
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* an operation is called the preliminary implementation of that operation. :An implementing 
function is not necessarily a total function on the representation value set. However, it has to 

_ be defined on every value of the representing domain. We use the following convention 
throughout the development to help associate an implementing function with the operation 
of the implemented type it implements: The identifier that denotes an implementing function 

. is the capitalized version of the identifier that denotes the corresponding abstract operation. 
For instance, NULLQ is the implementing function of the operation Nullq. 

The Definitions part consists of a set of rewrite rules of the form ee, The 
rewrite rules in the Definitions part defining an implementing function F are the ones that 
have F as the outermost symbol on their left hand side. e, and e, are expressions that may 
contain the implementing functions, the operations of the implementing types, and 


if_then_else with the following constraints: 


(1) The only operations of the representation type that may appear in e, and e, are the 
generators of the type. 


(2) e, and e, may not contain any auxiliary (or helping) functions other than 
if_then_else. ) 


There are two reasons for constraining the preliminary implementation. Firstly, we 
would like to constrain the structure of the preliminary implementation so that the synthesis 
procedure has to perform less work in searching for the desired solution. Secondly, we want 
to keep the language as simple as possible so that the principle behind the synthesis method is 
brought out more clearly in our description. 

The first constraint is imposed to keep the preliminary implementation derivation 
problem simple. This constraint permits us to ignore several axioms in the specifications of 
the implementing types during verification as well as synthesis of a preliminary 
implementation. In particular, the only axioms in the specification of the representation type 
that we need to consider are the ones that involve only the generators of the type involved in 
the specification. This is because only the generators of the representation type may appear 
in the preliminary implementation. To this extent this constraint simplifies the synthesis 
method. An implementation that also uses the rest of the operations is derived in the next 


x stage of the synthesis as a transformation of the preliminary implementation. 
The second constraint, in general, restricts the logical power, ‘i.e., the ability. to 
‘define any computable function on the representation type, of: the preliminary 
implementation language because the constraint prohibits the use of any helping (or - 
auxiliary) functions (except if_then_else) in a preliminary implementation. Our synthesis 
_ method cannot automatically discover the helping functions that might be necessary in the 
preliminary implementation. We use two approaches to get around this problem; both the 
approaches amount to relaxing the second constraint, They are explained here briefly, but 

are illustrated more clearly when we later consider examples involving them. 

The first approach consists of seeking help from the user. We require the user to 
furnish a specification of the helping function needed in the preliminary implementation. 
We then relax the second constraint to permit the use of the helping function in the 
preliminary implementation. 

The second approach consists of introducing a new construct into the preliminary 
implementation language. The construct, which is used primarily in conjunction with a tuple 
type, helps eliminate the need for helping functions while defining several functions on tuple 
types. The motivation for paying special attention to tuple type is because a tuple type is a 
commonly used representation type. The construct provides a way of accessing the 
components of a tuple being returned by an expression of tuple type without explicitly using 
the operations that select the components of a tuple. This construct may be used in 
expressions that appear on the right hand side of an equation of a preliminary 
implementation. The construct is expressed by means of an expression with. the following 
syntax: 

e, where Viscees v? is e, . 
In the above, v,,...,¥, are variables; e,, is an expression of n-tuple type; e, is an expression 
that may contain the variables Vysceos Vy The construct binds, in order, v,,...,V, to the 
components returned by e,,. The scope of the binding is limited to the expression e,. For 
example, consider the expression 
<Assign(v1, e, jl), i1, j1+ I> where <v1, i1, j1D is DEQUEUEK<y, i, D). Assuming 
DEQUEUE is a function from Triple to Triple, the variables vi. and j, in the above 
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expression are bound to the components of the triple returned by DEQUEUE(<y, i, >). 
_ 4.2 The Preliminary Imlementation Derivation Problem 


Our intention is to study the problem of synthesis within the data type verification 
_ framework. So we formulate the problem of deriving a preliminary implementation as 
roughly the inverse of the problem of proving the correctness of the preliminary 
- implementation. , 

First, we develop ‘the criterion of correctness of a preliminary implementation. 
Then, we formulate the problem of verifying if a preliminary implementation meets the 
’ correctness Gitenion. We define the derivation problem after that. For convenience, the 
verification problem and the derivation problem are formulated here for a situation in which 
the representing domain is identical to the representation value set. This situation 
corresponds to the case where the abstraction function is total, and the invariant part of the 
association specification is vacuous. We discuss the derivation problem for a situation where 
the representing domain is a subset of the representation value later, It should be noted that 


the formulation of the correctness criterion given below applies to both situations. 
4.2.1 The Criterion of Correctness 


Informally, for a preliminary implementation to be correct, the implementing 
functions it defines should collectively exhibit a behavior that is consistent with the 
observable behavior characterized by the specification of the implemented type. Also, the 
preliminary implementation should use a representation scheme that meets the requirements 
of the association specification given as input. Let us formalize this intuitive notion. 

The formal object that a preliminary implementation is denoting can be considered 
to be a heterogeneous algebra, called the implementation algebra, with the following 


components: 


(i) A principal domain that is a subset of the representation value set. The principal 
domain is defined as the set of all values of the representation type that are 
“reachable” through the implementing functions corresponding to the constructors 
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of the implemented type. In other words, the principal domain is the set of 
representation values generated by the closure under functional composition of the 


implementing functions corresponding to the constructors of the implemented type. 


(ii) A domain corresponding to every defining type of the implemented type. We 
assume that this domain is'identical to the value set of the corresponding defining 


type. 


(iii) a function corresponding to every implementing function defined by the preliminary 


implementation, 


A preliminary implementation is correct if the implementation algebra it denotes is 
a model of the implemented data type in a manner constrained by the association 
specification. This means that there exists a homomorphism from the implementation 
algebra to the the implemented type that behaves as an identity function on the values of the 
defining types, and exactly like the abstraction function characterized by the association 
specification on the values of the representation type. . . 

Let & denote the representing domain, and A denote the abstraction function 


specified by the association specification. Let 4 be a function defined as below. 


D: Implemented Type, &: Representing Domain, D,--- -D:The defining types of D 
%6:%UD,U...UD,>DUD,U...UD, 

A:%->D . 

4(r) = A(r) ifreR 


r otherwise 
A preliminary implementation of a data type is correct with respect to the association 
specification A , if the following two conditions hold. 


(1) Totality Property:Every implementing function is total over %. 


(2) Homomorphism Property. The operation f of the implemented type and the 
implementing function F are related by the property: 
(Vr € BIKE... F --0)) = £6 BO) vo) 
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The correctness criterion formulated above is different from the formulation found 
in the literature on data type verification [25, 14, 18] which is not formulated with respect to a 
; given homomorphism %, According to the conventional formulation a preliminary 
implementation is correct if there exists a function 36 from the representation value set to the 
value set of the implemented type so that: For all r€the principal domain, 
: (FG... 1,..)) = fi... 36(r),...). Thus, according to this criterion the implementing functions 
| are not required to be total with respect to%. Note that the principal domain can be a subset 
" of &. What distinguishes our formulation is the requirement that F be total over %, and also 
~ satisfy the homomorphism property over %. 

Our formulation is more useful in the context of synthesis. It enables us to 
determine a principal domain of the implementation algebra (which, in turn, determines the 
set of representation values on which every implementing function should be defined) 
directly from the association specification. This reduces the interdependence of the synthesis 
of preliminary implementation for the various operations of the type. This is because in other 
formulations the principal domain has to be determined by computing the closure under 
composition of the implementing functions of the constructors. Thus the domain of the 
implementing function of each of the constructors is, in general, dependent on the behavior 
of the implementing function of every other constructor. 

The totality requirement is also more interesting in the context of synthesis. In the . 
synthesis process the association specification initiates the derivation of an implementation by 
defining the representation scheme to be used. The association specification is expected to 
express the intention of the user regarding the representation scheme he wants the 
implementation (to be derived) to use. So it is logical to assume that the user wants the entire 
representing domain characterized by the association specification to be used for representing 


the values of the implemented type. 


4.2.2 The Derivation Problem =; 


The goal of the derivation problem is to derive a preliminary implementation from 
_ the given inputs so that the preliminary implementation meets the correctness criterion. The 
inputs consist of the specification of the implemented type, the specification of the 
| implementing types, and the homomorphism specification. The homomorphism specification 
| is a specification of the homomorphism 36 that the preliminary implementation ought to 
obey. This specification is easily derived from the specification of the abstraction function A 
| (given as a part of the association specification). The Honionvorphishn Specification contains 
two kinds of rewrite rules obtained as described below. The first set of rules specifies that 36 
behaves exactly like the abstraction function on the representation values. The second set of 
rules specifies that 36 behaves as an identity function on the values of all the ancillary types. 
| More precisely, 
(1) if A(e,) = e, belongs to the abstraction function specification 
then J6(e,) = e, belongs to Homomorphism Specification 
(2) if o isa generator of an ancillary type 
then H(alv,, ~++5V,)) = o(J6(y,),..., 36(¥,)) belongs to Homomorphism Specification 
Let us call the combination of all the input specifications the Input World (IW). The 
restrictions on the inputs (see sec 2.3.1 of the previous chapter) ensure that the Input World 
satisfies the principle of definition. The strategy behind the method used in deriving the 
preliminary implementation is based on the principle of definition property. | 
Suppose IW is supplemented with a set of rewrite rules, called the %-rules, that 
express the homomorphism property a preliminary implementation is expected to satisfy: For | 
every pair of an operation f of the implemented type, and its implementing function F there 
exists an %-rule of the form 3(F(¥,,,..., v,)) _ {6(V,).. -++, J6(¥,)). Let us call the 
supplemented system the Perturbed World (PW). Let us suppose that the addition of the 
46-rules does not destroy the uniform termination property of [W. The reason we refer to the 
supplemented system as the Perturbed World is because the addition of the 36-rules destroys 
the principle of definition property. PW does not satisfy the principle of definition because 
the implementing functions that are newly introduced into the system are as yet undefined. 
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4A constant involving the implementing function symbols does not simplify to a generator 
-constant, | 
j Recall that the principle of definition is a formal expression of the requirement that 
every nongenerator function in a system be completely defined as a total function. If we can 
generate a set of rewrite rules that can restore the principle of definition property of PW, then 
the new set of rules can be considered as a complete definition for the implementing 
functions, Thus, preliminary implementation derivation is a problem of restoring the 
principle of definition of a system that violates it. 
More precisely, the problem involved in synthesizing a preliminary implementation 
consists of deriving from the Perturbed World a set of rewrite rules, PI (the acronym stands 


for preliminary implementation), so that 
(1) PIU IW satisfies the principle of definition, as well as 
(2) PIU PW satisfies the principle of definition. 


In the following, we give a formal proof that the above conditions guarantee the correctness 


of the preliminary implementation. 


The Correctness Theorem 


Let PI be a set of rewrite rules derived so that the above two conditions hold. Then, PI 


satisfies the criterion of correctness of a preliminary implementation. 


Proof The first condition asserts that PI U IW satisfies the principle of definition. This 
implies that every nongenerator function in the system, which includes every implementing 
function, is defined as a total function. Hence, PI satisfies the Totality Property. 

To show that PI satisfies the Homomorphism Property, we have to show that every 
equation of the form 36(F(V,.,.-., V,)) = f(J6(v,),, ..., J6(v,)) is a theorem of PIU IW. The 
argument to show that the second condition implies this is based on the following interesting 
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* result about any system that satisfies the principle of definition. The result;!4 which is proved 
as Theorem 6 in Appendix III, enunciates a sufficient condition for an equation to be a 
‘theorem of a system that satisfies the principle of definition. Suppose S is a system that 


* satisfies the principle of definition, and e, =e is an equation so that e, and e, have at least 


2 
one nongenerator function symbol in them. Then, e, = e, isa theorem of SifSU {e, 4 e,} 
satisfies the principle of definition. The result is proved in the Lemma to follow. 

Because of the second condition PI U PW satisfies the principle of definition. Since 
PW is IWU %-rules, this implies that (PI UIW)U 46-rules satisfies the principle of 
definition. Now, by the first condition PI U IW satisfies the principle of definition. By 
applying the above result, each of the %6-rules (when treated as equations) is a theorem of 
PIUIW. Note that the result can be applied because the 3¢-rules have nongenerator 
function symbols in them. . 
QED. 


4.3 Derivation of a Preliminary Implementation 


In the previous section the problem of deriving a preliminary implementation was 
formulated as deriving a set of rewrite rules, PI, so as to restore the principle of definition 
property to the Perturbed World PW. This section develops a procedure to derive a 
preliminary implementation. The procedure makes. two assumptions about its input: (1) The 
Initial World (IW) satisfies SCPD, a sufficient condition for the principle of definition, and 
(2)a termination ordering > on expressions is available to the procedure to ensure the 
uniform termination property of rewriting systems. . 

The obvious strategy for the procedure is to derive the rules of the preliminary 
implementation so that PI UIW and PI U PW satisfy SCPD. But this limits the class of 


14. [22, 38] contain results similar to the one proved in this lemma. The result here is different 
because we have a different set of assumptions. The principle of definition property used in [22] is 
more constrained than the one we have. The result in [38] assumes that S satisfies a completeness 
property called fully specifiedness which is not assumed here. This is the reason for the requirement 
in the lemma that e, and e, should have at least one nongenerator function symbol in it. 


ie 


- implementations that can be derived by the procedure. So, we develop another set of 
conditions, called the synthesis conditions, that is weaker than SCPD. PI is generated so that 
it satisfies the synthesis conditions.. It can be shown that when PI satisfies the synthesis 
: conditions, PI U IW and PI U PW satisfy the principle of definition. We first formulate the 
- synthesis conditions, and then develop a procedure to derive a set of rules that satisfies the 


synthesis conditions. 
_ 43.1 The Synthesis Conditions 


The synthesis conditions for a set of rewrite rules PI are the following: 
(1) Totality Condition: 


(a) PI is well-spanned (for every implementing function) with every rule in it 
being of the form F(g,,...,.)—+t,/5 where F is an implementing 


function symbol, and g,,...,g, are generator expressions. 


(b) PI satisfies the ur:iform termination property. 


(2) Uniqueness Condition: PI has the unique termination property. 


(3) Homomorphism . Condition: For every rule F,,..., g,) —t in PI, 
I(F(g,,---»8,)) = J6(t) is a theorem of PW. 


The following Synthesis Theorem shows that when PI satisfies the synthesis conditions, 
-PLUIW and PIU PW satisfy the principle of definition, and hence, by the Correctness 
Theorem, PI is correct. An informal motivation for the conditions can be given as follows. 
The Totality Condition ensures that every implementing function is defined on all the values 
of the representation type, and it terminates on each of them. The Uniqueness Condition 
ensures that every implementing function is well-defined, in the sense that it yields a unique 


value for every argument value. The Homomorphism Condition ensures that the preliminary 


15. Note that the syntactic constraint on a preliminary implementation requires that t may contain 
neither the function symbol 3, nor any of the operations of the implemented type. 


ea 


+ implementation satisfies the Homomorphism Property. 


_ The Synthesis Theorem 


“If PI satisfies the synthesis conditions, then PI U IW and PI U PW satisfy the principle of 


definition, and hence PI is acorrect preliminary implementation. 


Proof It is easy to see that PI U IW satisfies the principle of definition because the Totality 

Condition and the Uniqueness Condition imply that preliminary implementation satisfies 
SCPD, and IW satisfies SCPD by our assumption about the inputs. 

Let NW denote PI U PW, for convenience. We apply Theorem 8 (Appendix III) to 

show that NW satisfies the principle of definition. According to that theorem, a rewriting 


system S satisfies the principle of definition if 
(a) S is well-spanned, 
(b) S has the uniform termination property 
(c) Every critical pair <a,, a, of S is such that a, = a, is a theorem of S. 


We show that NW satisfies all three premises of the above theorem. NW is well-spanned. 
This is because TW is well-spanned by our assumption, and PI is well-spanned by Totality 
Condition (a). The only nongenerator function symbols of NW are the ones in IW and PI. 
By Totality Condition (b) PI has the uniform termination property, so NW has the uniform 
termination property also. The following lemma shows that NW satisfies premise (c). 

QED. 


Lemma Every critical pair <e,, e,> of NW is such that e, = e, is a theorem of NW. 


Proof Note that PW is convergent. This is because IW is convergent by assumption, and the 
36-rules added to IW do not give rise to any new critical pairs. 

NW is constructed from PW by adding PI to the former. Therefore, any new 
critical pairs of NW would be generated as a result of a superposition of the rules of PI on the 
rules of NW. Because of Totality Condition (a) on the form of the rules in PI the only rules 
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s on which the rules of PI can have a superposition are the following: 


_. (I) The rules of PI themselves, or 


ee 


. ” 


(I1) the rules of the implementing types, 
(IID) the 36-rules. 


Every critical pair <e,, e,> determined by a superposition on the rules in 
| category (1), and (II) is such that e,¥ is identical to e,+. This is because, by the Uniqueness 
’ Condition, PI satisfies the unique termination property. Hence, e, = e, is a theorem of NW. 
| Every critical pair determined by a superposition of the rules in category (II]) is of 

the form <He(F(E,, at g,)). I(t)>, where F(g,,...,g,)—+t is a rule in PI. By the 
Homomorphism Condition, 36(F(g,,..-,8,)) = 36() is a theorem of PW, and hence a 
theorem of NW. | 

Q.ED. 


4.3.2 Derivation of the Rules of PI 


The rewrite rules of PI are derived from the Perturbed World (PW). So the initial 
task of the derivation procedure is to construct PW. PW is a rewriting system that includes 
the Initial World (IW) and the 36-rules. IW is constructed by combining the specification of 
the implemented type, the specifications of the implementing types, and the Homomorphism 
Specification. Without any loss of generality, we assume that there is no conflict among the 
names of the various function symbols in the specifications. PW is formed by then adding a 
Tule of the form 36(F(V,,,...,¥,)) 4 f(36(v,),, ... , J6(v,)) for every implementing function F 

to be defined. We assume that the termination ordering > being used by the synthesis 
procedure is such that 1(F(V,,,...,¥,)) Ss {(36(v,),, ---,36(v,)), for every implementing 
function. This ensures that PW retains the uniform termination property as desired by the 
derivation problem. Note that this is not a restriction because the implementing function 
symbols (in the 36-rules) are fresh symbols being introduced into IW. Hence, an appropriate 

- ordering can always be found. 
Although PW is defined to include the specification of every implementing type | 
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. completely, it is not necessary to da so. Since the derivation method does not require the — 

" specifications to be complete, one: may include only parts of the specifications of the 

implementing types. The advantage of doing so is that the fewer rules in PW the more 

efficient it is to derive the preliminary implementation. However, by not including certain 

~ rewrite rules one might be excluding.certain implementations. 

| Let us illustrate the construction of PW on an example. We consider the derivation 
of an implementation for Queue_Int with Circ_List as the representation type using the 
association specification given in Fig. 9 in the previous chapter, Fig. 13 gives the rules of PW 
for the example under consideration: The rules of the types Integer and Bool, which are also 


among the implementing types are omitted from the figure for convenience. The rules of the 


Fig. 13. The Perturbed World 


(1) Front(Nullq) — ERROR 
. (2) Front(Enqueuc(Nullg, e)) — e 
(3) Front(Enqueuc(Enqueue(q, e1), e2)) + Front(Enqueue(q, e1)) 


(4) Dequeuc(Nullq) — ERROR 
(5) Dequeue(Enquecue(Nullg, e)) — Nullq 
(6) Dequeuce(Enqueue(Enqueue(q, e1), e2)) + Enqueue(Dequeuc(Enqueue(q, e1)), e2) 


(10) Append(q, Nullq) — q 
(11) Append(q1, Enqueue(q2, e2)) — Enqueue(Append(q1, q2), 2) 


(12) Empty(Nullq) — True 
(13) Empty(Enqueue(q, e)) — False 


(14) 36(Create) — Nullq 
(15) H(Insert(c, i)) + add_at_head(J6(c), 36(i)) 


(16) add_at_head(Nullq, i) —+ Enqueue(Nullg, i) 
(17) add_at_head(Enqueue(q, i), i1) + Enqueue(add_at_head(q, il), i) 


(19) (NULLQQO) — Nullq 

(20) J(ENQUEUE(C, i)) + Enqueue(36(c), 6G) 
(21) 6(DEQUEUE(c)) + Dequeue(36(c)) 

(22) 36(APPEND\(cl, c2)) — Append(36(c1), 36(c2)) 
(23) 6(EMPTY(c)) + Empty(36(c)) 
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« representation type Circ_List are omitted because they are not going to be used in the 
- derivation of the preliminary implementation. This situation’ arises because a preliminary 
implementation is permitted to use only the generators of the representation type. So, the 
only rules of the representation type needed in verification, and hence also in the derivation 
| of a preliminary implementation, are the ones that contain only the generators. Since 
: Cire_List does not have any rules of this kind, Circ_List does not contribute any rules to TW. 
Rules (1) through (13) in the figure are rules of Queue_Int; rules (14) through (17) are the 
rules of Homomorphism Specification. 

The next task is to derive the rewrite rules of PI from PW. Strictly speaking, PI 
should be derived so that all the three synthesis conditions are satisfied. But, it is more 
convenient to develop a procedure that derives the rewrite rules so that only the Totality 
Condition and the Homomorphism Condition are met. The effect of ignoring the 
Uniqueness Condition is not harmful in the sense that it can be fixed at a later stage by 
post-processing the preliminary implementation. The Uniqueness Condition ensures that 
every implementing function defined by PI returns a unique value on every representation 
value. When the Uniqueness Condition is not satisfied, an implementing function F being 
defined by PI may be nondeterministic: That is, F can be so that F(v) = vy and F(y) = > 
but v, #v,; however, both the values vy, and v, will represent the same value of the 
implemented type. The nondeterministic behavior, if any, in the preliminary implementation 
will be eliminated by our synthesis procedure in the second stage while deriving a target 
implementation. The semantics of the target implementation language is such that it is 
impossible to define nondeterministic functions. 

The procedure derives the preliminary implementation for one operation at a time 
by deriving a separate set of rewrite rules for every operation. The method used is the same 
for every operation. The procedure first determines the left hand sides of all the rules of the 
preliminary implementation. Then, it determines a suitable right hand side for each of the 


rules from the already determined left hand side. 
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» 4.3.2.1 Determining the Left Hand Side 


The Totality Condition is used to determine the left hand side of the rules. The 
_ Totality Condition has two parts: The first part requires PI to be well-spanned, and the 
_ second part requires PI to have the uniform termination property. The second part is 
| ensured while deriving the right hand side, which will be discussed later. The first part is 
used here. | 

| The well-spannedness property (described formally in sec 2.3.1 of the previous 
| chapter) requires the left hand side expressions of the rules defining an implementing 
function F to satisfy the following property: The set of generator expressions the appear as 
arguments to F on the left hand side should span the set of all generator constants. More 
' precisely, suppose the preliminary implementation of F consists of the following set of rules: 
(In the following the question mark identifiers are used as place holders for expressions to be 


determined later.) 


Then, the set {g,,...,g,} should be well-spanned (see sec 2.3.1), i.¢., span the set of all 
generator constants of the appropriate implementing type. For instance, as a concrete 
example, any pair of rules that have the form given below constitute a well-spanned set of 
rules for ENQUEUE. 

ENQUEUE(Create, j) — rhs, 

ENQUEUL{Insert(c, i), j) + ?rhs, 

Note that the left hand side of each of the above rules consists of ENQUEUE 
applied to arguments that are generator expressions. The set of arguments, i.e., sequences of 
generator expressions, to ENQUEUE on the left hand side of the rules is 
_ ArgsSet = {<Create, j>, <Insert(c, i), j>}. ArgsSet spans the set of all ordered pairs of 
generator constants because every pair of generator constants (the first one of type Circ_List, 
and the second of type Integer) is an instance of one of the arguments in ArgsSet. 

It is easy to build a procedure that automatically generates a well-spanned ArgsSet, 
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¥ once the generators of the representation type are identified. In fact a slight modification to 
the procedure referred in sec 3.3.3 (which checks if an ArgsSet is complete) can be used to 

_ generate a complete set of argument expressions. Thus, an appropriate set of left hand sides 

, for the rewrite rules to be derived can be determined automatically. 

. Fig. 14 gives a possible set of left hand side expressions for a preliminary 

_ implementation for the example under consideration. Note that the right hand side of each 
of the rules in the figure is denoted by a question mark identifier. So Fig. 14 can be 


considered as a partial preliminary implementation of Queue_Int. 
4.3.2.2 Determining the Right Hand Side 


The right hand side of each of the rules is determined using the already determined 
we left hand side so that the Homomorphism Condition and’ the second part of the Totality 
Condition are met. This where the Perturbed World (PW) conies into the picture. 

PW is used to derive a set of equations, called the synthesis equations, one equation 
for every rule in the preliminary implementation. The right hand side of a rule is determined 


from the right hand side of the corresponding synthesis equation. The synthesis equation 


Fig. 14. A Partial Preliminary Implementation 
(1) NULLQO — ?rhs, 


(2) ENQUEUE(Create, j) — ?rhs, 
(3) ENQUEUE(Insert(c, i), j) — ?rhs, 


(4) FRONT(Create) — ?rhs, 
(5) FRONT(Insert(c, i)) — ?rhs, 


(6) DEQUEUE(Create) — ?rhs, 
(7) DEQUEUE(insert(c,i)) —+ rhs, 


(8) APPEND(c, Create) —+ ?rhs, 
(9) APPEND(c, Insert(d, i)) + ?rhs, 


(10) SIZE(Create) —+ ?rhs,, 
- (11) SIZEMnsert(c, i)) — ?rhs,, 
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» corresponding to a rewrite rule F(g,) — ?t, is an equation of the form (F(g,) = 36(?t,) that 


. Satisfies the following conditions: 
(1) 4(F¢,) = K(*t,) isa theorem of PW 
(2) 316(F(g,) > 6(2t,), where > is the termination ordering on expressions. 


(3) ?t, contains the implementing function symbols and the permitted operations of the 


implementing types. 


it is easy to see the justification for the above ‘conditions. ‘The first condition 
contributes towards ensuring the Homomorphism Condition. The second condition ensures 
the uniform termination property. The third condition is just a syntactic constraint that any 
rule in a preliminary implementation ought to satisfy. The next section describes in detail a 


procedure to derive the synthesis equations. 
4.4 Deriving the Synthesis Equations 


Every synthesis equation of the preliminary implementation is derived with the help 
of two inference rules called the synthesis rules. The synthesis rules are designed for 
generating theorems of PW that have the same left hand sides, but different right hand sides. 
For deriving a synthesis equation, the synthesis rules are invoked repeatedly a finite number 
of times to generate a series of theorems until the desired equation is generated. For instance, 
the synthesis equation corresponding to the rule ENQUEUECInsert(c, i), j) + ?rhs, (in the 
partially derived preliminary implementation given in Fig. 14) is derived by generating a 
series of theorems that have 4(ENQUEUE(Insert(c, i), j)) as their left hand side. The 
generation continues until a theorem whose right hand side qualifies the theorem to be a 
synthesis equation is encountered. 

We investigate two ways in which the synthesis miles can be used for deriving a 
synthesis equation. The first one derives synthesis equations that are in the equational theory 
of PW. The second one derives equations that are in the inductive theory. The second 
method is more general than the first one. A system that implements the synthesis procedure 


would, therefore, use only the second method. We discuss them separately for pedagogic 
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“ reasons, First, we formulate the synthesis rules. The subsequent subsections describe the use 
. of the synthesis rules in deriving the synthesis equations. 


t 


"4.4.1 The Synthesis Rules 


The idea used for generating an equation is to reverse the method of demonstrating 
’ that the equation is a theorem of PW. The central notion used in the generation is 
expansion. Expansion is the opposite of reduction. It is the act of applying a rewrite rule to 


an expression from right to left. 
: 4.4.1.1 Informal Explanation 


The basis for the synthesis rules is the result given in the KB-Theorem (sec 3.3.3.1). 
The theorem gives rise to the following principle for generating equations that are theorems 
of a convergent system. Suppose e, is an expression that we wish to have as the left hand side 
of the equation. Then, an expression ?e, that may appear on the right hand side of any 
equation that has e, as its left hand side should be such that e,4 = 7e,4. One way of 
ensuring that ?e, simplifies to e,4 is to obtain 7e, by applying to e,4 the rewrite rules of the 
system from right to left a finite number of times. We call the mechanism of applying a rule 
to an expression from right to left expand. 

We will give a formal definition of expand; and discuss its properties later. Here, we 
will give an approximate description of what expand does so that we may develop a first 
version of the synthesis rule, and illustrate them on the example.16 Like reduce, performing 
expand consists of several steps. Suppose we wish ‘to expand 
Add_at_head(Enqueue(36(c), 36())), 36(i)) using the rule 
4%(ENQUEUE(e, j)) > Enqueue(36(c), 56). One way of doing this is to look for a 
subexpression (inside the expression to be expanded) that has the form of the right hand side 


16. We will generalize the definition of expand later. At that point one of the synthesis rules needs to 
revised slightly as well. According to the definition given here, expansion is identical to the 
transformation technique folding used by Darlington [7] for synthesis of recursive programs. 
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‘ of the rule. Then replace the subexpression by the corresponding instance of the left hand 
side of the rule. In the present case, the subexpression that appears as the first argument to 
'. Add_at_head in the given expression matches the right hand side of the rule for the identity 
substitution. The result of expanding the expression is then 
Add_at_head(6(ENQUEUE(c, jp, 36(i)). The result of expanding an expression e in the 
occurrence u by a rule y > 6 is denoted by expand e in u by y— 5. We use expand(e) to 
' denote any expression that is obtained by expanding e in some occurrence u by some rule 
y — 6 in the rewriting system under consideration. 
We are now in a position to give the synthesis rules. The first rule specifies how to 
start the generation of a series of theorems; it generates a theorem from a given expression 
without the need for any existing theorem. 


e is an expression 
1: € 1S an expression 
Rule eel 


The second rule specifies a way of generating a new theorem from an existing one using 


expand. 


e =e 
Rule 2: = expand(e,) 


To familiarize the reader with the synthesis rules let us invoke each of the synthesis rules to 
generate a couple of theorems that have 3%(ENQUEUE(Insert(c, i), j)) as their left hand. We 
use the rewrite rules of PW given in Fig.pwl for expansion and reduction. The normal form 
of 3(ENQUEUE(Insert(c, i), j)) is Enqueue(Add_at_head(36(c), 36(i)), 36(j)), which is 
obtained by using the rewrite rule (20) and then (15) for simplification. By invoking synthesis 
tule (1) with e = 36(ENQUEUL(Insert(c, i), j)), we generate the following theorem of PW: 


46(ENQUEUE(insert(c, i), j) = Enqueue(Add_at_head(36(c), 36(i)), 66) 


_ Let us now invoke synthesis rule (2) on the above equation. Using the rewrite rule (17) to 
expand the entire expression on the right hand side of the above theorem, we can generate 


the following theorem of PW: 


36(ENQUEUE(Insert(c, i), )= Add_at_head(i6(ENQUEUE(, j)), 36(i)) 
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' 4.4.1.2 Formal Definition of Expand 


Expansion is roughly the reverse of the process of reduction. The relation that 

‘characterizes a single step of expansion is called expand. Expanding an expression using a 
rule is close to ‘applying the rule to the expression from right to left. 

. The motivation for introducing the mechanism of expansion is to solve a common 
problem encountered during synthesis: This is to find an expression (a desired expression) 
that simplifies to given expression (the starting expression). For instance, in the derivation 
shown earlier, the starting expression was Enqueue(Add_at_head(36(c), 36(i)), 36@j)), and the 
desired expression was 3(Insert(ENQUEUE(c, j), i)). 

The definition of expand uses the concept of unification, and the most general 
unifier (see Appendix I). Let t be an expression, and y + 6 be a rule. We assume that t and 
y have disjoint variable sets. If there are common variables then they have to be renamed 
suitably. Let u be an occurrence in t such that t/u.is unifiable with 5; let @ be the most 
general unifier. Let t' be the expression t[u — 6(y)]. Then, we say that t expands to t' by 
y — 6 in u; we denote this relation by t — t’. Notice that expanding t by y — 6 in u is not 
equivalent to reducing t by 8 — y in wu. Expand checks if t/u is unifiable with 8, whereas 
reduce checks if t/u has the form of 5. Therefore, there are situations where an expression is 
expandable by y — 6, but not reducible by & — y. 

The following question arises immediately: Why was expand not defined exactly as 
applying a rule in the reverse direction ? The reason is that a rule y -+ 5 may be such that 
varset(y) > varset(5). Applying such a rule from right to left will result in an expression that 
contains "new" variables, i.e., variables that did not exist in the original expression. The use 
of such variable dropping rule during reduction represents a situation where the reduction 
step caused a “loss” of information: A new variable introduced in an expansion step might 
have had in its place an arbitrary expression during the corresponding reduction step. Our 
goal is to reconstruct, if possible, this lost information at a later stage in the expansion process. 
During expansion, therefore, a variable in an expression has to be treated, in general, as 
though an arbitrary expression might be in its place. Using the predicate unifiable to 


determine if an expression is expandable enables us to do this. 


For instance, consider ithe expansion of Append(q, Nullg) by the rule 
Dequeue(Enqueue(Nullg, e)) + Nullq. The resulting - expression is 
~ Append(q, Dequeue(Enqueue(Nullg, @))). The variable e is a new variable introduced because 
of expansion. Every instance of the latter expression in which e is replaced by any other 
. expression reduces to the former expression. It might be possible to determine the expression 
. that has to take the place of e in future expansion steps. 

It should be pointed out, however, that not all variables in an expression need be 
given such a special treatment during expansion. The variables that appear in the starting 
- expression must appear as they are in the desired expression we are shooting for. Therefore, 
while expanding an expression, it is necessary to distinguish between the variables in the 
expression that were introduced by a rule (presumably during earlier steps of expansion) and 
the ones that were transferred to the expression from the starting expression. We classify the 


variables involved in expansion into the following two kinds: 
(1) The variables appearing in the rewrite rules; we continue to call these variables. 


(2) The variables appearing in the expressions on the left hand sides of the rewrite rules 
in the partially generated preliminary implementation (Fig. 14). We call these 
variables terminals. Henceforth, we denote terminals by identifiers that are in 
italics. | 
The definition of an expression remains as before except that it may also contain 

terminals in it. The definition of a substitution also remains as before; it is a function from 
variables to expressions. Thus, when a substitution is extended to be applicable on an 
expression, the terminals in the expression are not substituted for, as we desired. 

In the wake of the formal definition of expand, and the preceding discussion about 
the introduction of variables into expressions due to expansion, we should reconsider the 
formulation of the synthesis rules. The first synthesis rule remains unchanged because it does 
not use the relation expand. The second synthesis rule was formulated as below: 

e, =e, 


Rule 2: EPR ag 4 
e, = expand(e,) 


- 83- 


‘ This formulation is not general enough because it does not account for all the theorems that 
can be derived from e, =e, in one expansion step. If expand(e,) has variables in it, then 
every instance of it can potentially be the right hand side of a theorem. Hence, we 


re-formulate the rule as follows: 


e, =e,,0 is a substitution 
Rule 2: aa SESS PAEARER TO a 
| = o(expand(e,)) 


4.4.2 Derivation in the Equational Theory 


As an illustration, let us derive a synthesis equation that is of the form 
4(ENQUEUE(Insert(c, i), j)) = 36(?rhs,) in the partial preliminary implementation shown in 
Fig. 14. The equation is derived by generating a series of theorems that have 
46(ENQUEUE(Insert(c, i), j)) as their left hand side. The generation is begun by invoking 
synthesis rule (1) on the left hand side expression. The rest of the theorems in the series are 
generated by invoking synthesis rule (2) using the rewrite rules of PW for expansion. The 
rewrite rules for expansion are chosen with the following ultimate goal: Obtain aright hand 
side that has the form 36(?rhs,) so that 36(ENQUEUE(Insert(c, i), j)) = 26(2rhs,), and ?rhs, 
contains only the permitted operations of the implementing types. In the illustration given 
below, the generation of every theorem in the series is considered as astep. At each step, the 


expression expanded, and the rewrite rule used for expansion are indicated. 


Relevant Rewrite Rules of the Perturbed World 
(1) 36(ENQUEUE(e, j)) —+ Enqueue(36(c), 36) 


(2) 36(Create) — Nullg ; 
(3) 36(nsert(c, i)) + Add_at_head(36(c), i) 


(4) Add_at_head(Nullq, i) + Enqueue(Nullq, i) 
(5) Add_at_head(Enqueue(q, i), j) -+> Enqueue(Add_at_head(q, j), i) 


Form of the theorem to be generated: 76(ENQUEUE(Insert(c, i), j)) = J6(?ths,) 
Normal form of 4(ENQUEUE(Insert(c, i), j)): Enqueue{Add_at_head(36(c), i), 36(j)) 
Rules used for the normal form: (1), (3) 


Step (1) Invoke Synthesis Rule (1) on H/(ENQUEUE(nsert(c, i), j)) 
H(ENQUEUEInsert(c, i), j)) = Enqueue(Add_at_head(36(c), i), 36@)) 
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j Step (2) Expand Expression: Enqueue(Add_at_head(J6(c), i), 36())) 
Using Rule: (5) 


ee ee Pr rrr hr rrr Sennen ett ft ty 


36(ENQUEUE(Insert(c, i), j)) = Add_at_head(Enqueuc(36(c), 36(j)), i) 


" Step (3) Expand Expression: Enqueuc(‘J6(c), 36(j)) 
Using Rule: (1) 


6(ENQUEUF(Insert(c, i), j)) == Add_at_head(36(ENQUEUE(c, )), 


Step (4) Expand Expression: Add_at_head(J6(ENQUEUE(, j)), i) 
Using Rule: (3) _ 


46(ENQUEUE(Insert(c, i), j)) = 6(Insert(ENQUEUE(e, j), i) 


The theorem generated in step (4) qualifies to be a synthesis equation. 
Hence the desired rule of the preliminary implementation is: 
“ENQUEUF(Insert(c, i), j) + Insert(ENQUEUE(, j), i) 


44.3 Derivation in the Inductive Theory 
4.4.3.1 The General Strategy 


The method used for deriving a synthesis equation in the inductive theory is based 
on the following property that every theorem of PW satisfies: If an equation is a theorem of 
PW, then every instance of it is in the equational theory of PW. An instance of an equation 
e, =e, 
constants. 


is an equation obtained by replacing every variable in e, and e, by generator 


We, therefore, take the following approach. Suppose the synthesis equation we 
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~ wish to derive is of the form 36(F(e,,)) = s6(2e,,).17 We first derive an instance of the desired 
equation: This is done by selecting an instance of the left hand side, say o(36(F(e, .))), for 
. some substitution o of the terminals in e,, to generator constants. Then, an instance of the 
equation o(36(F(e, )) = o(36(e,,)) is derived; the method of derivation for the equational 
_ theory described earlier can be used for this purpose. The instance of the equation derived 
| should be such that a generalization of it 16(F(e,,)) = 36(e,,), which is obtained by replacing 
assorted constants by suitable terminals in the instance, is a theorem of PW. 

| To check if the generalization is a theorem of PW, we use an automatic procedure 
called Is-an-inductive-theorem-of. This procedure is capable of deciding a significant number 
of theorems in the inductive theory of a system. The procedure will be described in a 
subsequent subsection. Another topic that will be deferred until later is determining a 
suitable o. Any substitution that maps all the terminals in the left hand side of the synthesis 
equation to arbitrary generator constants will serve our purpose. However, the derivation 
would be more efficient if we instantiated as few terminals as possible. A later subsection will 
discuss a method of determining a more judicious way of choosing o. 

In the rest of this subsection, we formalize the notion of the generalization of an 
equation, and then illustrate the general strategy by deriving a synthesis equation 
corresponding to the rewrite rule APPEND(c, Insert(d,i)) — ?rhs, in the partial preliminary 
implementation of APPEND given in Fig. 14. 


The Generalization of an Equation 


The generalization of an equation e, = e, with respect to a substitution o is the set of 


1 


equations such that e, =e, is an instance of using o. When the substitution with respect to 


2 
which the equation is being generalized is obvious from the context, we denote the 
generalization by Genfe, = e,]. Formally, every equation e} = e} € Genfe, = e,] is such that 
o(e!) = e,, and o(e}) = e,. Note that if €, = e, has a finite number of function symbols 


Genfe, = e,] is always finite. For instance, suppose o is {d++ Create}. 


17. Recall that the left hand side of the synthesis equation is already known. 
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‘Then, Gen[s6(Append(c, Insert(Create, i))) = 36(4{ APPEND(ENQUEUK(c, i), Create))})) 
contains the following equations: 
46(Append(c, Insert(Create, i))) = 36({APPEND(ENQUEUE(c, i, Create))})) 
36(Append(c, Insert(d, i))) = 4(APPEND(ENQUEUE(c, J), d)) 


As an_ illustration Jet us derive an equation of the form 
6(APPEND(, Insert(d,i ))) = 36(?rhs,) which gives rise to one of rules in the preliminary 
implementation of Append. The derivation begins with the choice of the left hand side of the — 
instance of the equation to be derived: This has to be an _ instance of 

46(APPEND(c, Insert(di ))). Let us suppose o is {d+ Create}. 


Relevant Rewrite Rules of the Perturbed World 


(10) Append(q, Nullq) — q 

(14) 36(Create) — Nullgq 

(20) 36(ENQUEUE(e, i)) + Enqueue(36(c), 36G))) 
(22) JE(APPENIX(c, d)) + Append(J6(c), 36(d)) 


Form of the theorem to be generated: 36(APPEND\(c, Insert(Create, ))) = J6(?e) 
Normal form of 36(APPEND(c, Insert(Create, D)): E; sds 36(1)) 
Rules used for the normal form: 


Step (1) Invoke Synthesis Rule (1) on 36(APPEND(c, Insert(Create, 2))) 
36(APPEND(c, Insert(Create, ))) = Enqueue(36(0), 36()) 


Step (2) Expand Expression: Enqueue(36(0), J6(i)) 
Using Rule: (10) 


Se esa eS Beeches Pen EERE seme rese rHeEESeER seneeERarencemaenEnoEsEEaeD 


36(APPEND\(c, Insert(Create, i))) = Append(Enqueue(36(c), 36()), Nullq) 


Step (3) Expand Expression: Nullq 
Using Rule: (14) 


36(APPEND(c, Insert(Create, ’))) = Append(Enqueue(36(c), 36(1)), 16(Create)) 
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3 Step (4) Expand Expression: Enqueue(36(c), 36()) 


Using Rule: (20) 


+ Semmewe weseneese wen sneee nes caneneesener ee nsesen en nesees seesenenneee 


3G(APPEND(c, Insert(Create, ’))) = Append(J(ENQUEUE(c, i), 36(Create)) 


' Step (5) Expand Expression: Append(J6(ENQUEUE(c, 3), 36(Create)) 


Using Rule: (22) 


36(APPEND(c, Insert(Create, ))) = 36(APPEND(ENQUEUE(c, 2), Create)) 


Step (6) Generalize the theorem in step (5) by replacing the constant 
Create by the variable d to obtain the following equation: 
36(APPEND(c, Insert(d,i))) = J6(APPEND(ENQUEUE(., 3, d)) 
Apply Is-an-inductive theorem-of on the above equation. 

This yields True confirming that the equation is a theorem. 


Hence the desired rule (obtained by dropping 3% on both sides) is: 
APPENIXc, Insert(d,i)) + APPEND(ENQUEUK(c, i), d 


4.4.3.2. The Predicate Is-an-inductive-theorem-of 


Is-an-inductive-theorem-of is a procedure that is used for checking if an equation 


e, = e, is a theorem of a convergent rewriting system S. The procedure is designed so that if 


it yields true on e, =e, then e, =e, is a theorem of S; if it yields false, then nothing can be 
said about e, =e,. While deriving a synthesis equation in the inductive theory, the 
procedure is used to check if a generalization of an equation is a theorem of PW. The 
procedure is described here. . 

The procedure is based on a method of using the KB-algorithm (see sec.3,3,3.1) for 
checking the convergence for proving inductive properties of a rewriting system. Suppose S 


is a convergent rewriting system. To check if e, =e is a theorem of S, perform the following 


steps: 


2 
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(1) FormS, = SU {e, +e, (ore, + e,)}. 


(2) Check if S, is convergent. The KB-algorithm of checking convergence (which 
consists of checking if every critical pair <a,,0,> of S, is such that a= a,+) is 


used for this. 


If the result of step (2) is affirmative, then e, = e, is a theorem; otherwise nothing can be said 


2 
about it, in general. Let us assume that there exists a procedure, called 


Can-be-made-convergent, that implements this method. 


We will first briefly summarize the method, and then’ describe how 
Is-an-indactive-theorem-of is built on top of it. 

The result that provides a basis for the above method is proved in Theorem 7 in 
Appendix III which gives a few useful results about convergent systems. The result is similar 
to the one that was first developed by Musser [38], and that has also been investigated in [22]. 
Our result is different because the cited works assume that S satisfies a notion of 
completeness (similar to the principle of definition) besides convergence. | 

In the present situation PW, whose theorems we are interested in, is convergent but 
~ does not satisfy the principle of definition. - Because of this the above method is applicable 
only when e, (or e,) is such that for every instantiation of the variables by generator constants, 
e, simplifies to a generator constant. The left hand side of every equation we wish to check is 
of the form 36(F(g,,... »£,))s where F is an implementing function symbol, and g,,...,g, 
are generator expressions. Note that 36(F(g,,...,8,)) reduces to f(J6(g,,...,g,)) by the 
36-rule corresponding to F. The latter expression satisfies the desired condition since f and 36 
are well-spanned!8 by PW. | 

There are several situations when the method described above is not applicable for 
~ proving an equation e, = e,. But there exists another equation e| = e} such that 


18. Note that if a function f is well-spanned by PW, then every term of the form fit,,--+5 ty where 
tye. +, t, are generator terms, can be simplified to a generator term using PW. 


- 89 - 


(1) e1 = e} can be proved using the above method, 
(2) e, = e, is a theorem if e! = e! is a theorem, and 
(3) e! = e} can be derived automatically from e, = e,. 


In other words, e; =e} is serving as a lemma for the theorem e, = e,. The 


: procedure Is-an-inductive-theorem-of consists of transforming e, = e, to e} = e}, and then 


applying Can-be-made-convergent on e! =e}. The transformation of e, =e, toe} =e} is 
performed by a function L, called the lemma deriving function. The lemma deriving function 


used by Is-an_inductive-theorem-of is defined below: 


The Lemma Deriving Function (2) 
£ is a function on expressions. 2 can be used to derive for a given equation e, = e, a lemma 


that the proof of the former is dependent on. The two sides of the lemma are obtained by 


applying £ toe, and e,, 
ZL: expression -> expression 
Usage: L(a,) 


Pre: a, is of the form 36(a,), where «, does not contain the symbol 36. 


Returns: An expression £ that is obtained by replacing in a,4 every subexpression of 


the form 36(d), where dis any terminal, by a new terminal d,. 


We will now illustrate the procedure Is-an-inductive-theorem-of to check if the 
equation 36(APPEND(c, Insert(d,i ))) = 36(APPEND(ENQUEUK(c, i), d)) is a theorem of 
PW being used in our example. The equation was obtained in step (6) while deriving a 
synthesis equation in the previous section. 

Equation to be checked: J6(APPEND(c, Insert(d,i ))) = J6(APPEND(ENQUEUE(<, i), 2). 
Step (1) Derive Lemma by applying L: 


(a) Simplify both sides, 
(b) Replace 36(c) by q, 36(d) by R, 36()) by i 


+ 4G(APPEND(c, Insert(d, ))) ; 36(APPEND(ENQUEUE(e, ’), 3) 


: Append(36(c), Add_at_head(36(q), ())) Append(Enqueue(36(c), 36()), 36(d) 
~ Lemma to be checked: Append(q, Add_at_head(R, i)) = Append(Enqueue(q, i), R) — 


Step(2) Check if critical pairs are convergent: 
« (a) Critical pair determined by Rule (16): 


Append(q, add_at_head(Nullq, j))_ 
Append(Enqueue(q, j), Nullq) Append(q, Enqueuc(Nullq, j)) 


Enqueuc(q, j) Enqucue(q, j) 
(b) Critical pair determined by Rule (17): 


Append(q, add_at_head(Enqucue(r,, j,), j)) 
Append(q, Enqueuc({add_at_head(r,, j), j,)) Append(Enqucuc(q, j), Enqueue(r,, j,)) 


Enqueue(Aypend(Enqueue(q, j), 1,), j,) Enqueue(Append(Enqueuce(q, j), r,), i) 


4.4.3.3 An Instantiation for the Synthesis Equation 


Here, we describe a method of finding a substitution o that determines the left hand 
side of the instance of the theorem we wish to generate. Note that the left hand side of the 
theorem is already known to us which in the current example is 36(APPEND(¢, Insert(d, i))). 
o maps the terminals in the left hand side expression to suitable expressions. o should be 
chosen so that the equation o(36(APPEND(c, Insert(d, /)))) = o(36(?e,)) is in the equational 
theory of PW. This implies that o should be such that o(36(APPEND(c, Insert(d, ))))) and 
o(36(7e,)) have the same normal form. Note that J6(?e,) is unavailable to us at the moment. 
So, o has to be determined from the left hand side expression alone. Since the theorem 
46(APPEND(c, Insert(d, ) = 36(7e,) is not necessarily in the equational theory of PW, an 
arbitrary substitution that maps terminals to generator terms cannot be used. 


- The following fact about our proof method (for inductive properties) serves as the 


~91- 


: basis for the method of finding «. The basis step of the inductive proof can always be carried | 
out using the equational logic. So, we choose the o that corresponds to a basis step of the 
: proof of the lemma. The instantiation corresponding to the basis step can be determined 
- automatically starting from the left hand side of the theorem alone. 

Finding such a a involves two stages because the proof of the theorem, as you may 
. recall, involves two stages: Converting the theorem to the lemma, and then proving the 
_ lemma itself. We first determine a substitution w that corresponds to a basis step of the proof 
of the lemma. o is determined from w using the method used by the lemma defining 


function £ to convert the theorem to the lemma. We describe the two steps below. 


Step (1) Determination of w 


(a) Find the left hand side of the lemma. 
This is obtained by applying 2, the lemma defining function, to the left hand side of 
the theorem. For our example: Left hand side of the theorem is 
46(APPEND(c, Insert(d, ))). To obtain the left hand side of the lemma, we simplify 
the expression, and replace every subexpression that has 3% at the root by a new 
terminal: #(APPEND(c, Insert(d, )))) +* Append(36(c), Add_at_head(36(d), 36(/))). 
So the left hand side of the lemma is Append(q, Add_at_head(R, i)). 


(b) Find a basis step in the proof of the lemma 

For this, compute all the superpositions between the left hand sides of the rules of 
PW and the left hand side of the lemma. Simplify the superpositions. A sufficient 
condition for a superposition to correspond to a basis step is that its normal form is a 
generator expression. The most general unifier that determines such a superposition 
is a candidate o. The following table gives the result of performing the above steps 
on the current example. The columns, in order, give the rewrite rule in PW 
responsible for the superposition, the superposition, and the normal form of the 
superposition. The first superposition in the list simplifies to a generator expression. 
Therefore, w is the most general unifier corresponding to the first superposition, 
which is {R ++ Nullq}. 


- 92 - 


+ Rule Superposition (Superposition)+ 

. (16) Append(qg, Add_at_head(Nullq, J) Enqueue(g, ) 

(17) Append(q, Add_at_head( Enqueue(Append(q, 

Enqueue(r,, i). 9) Add_at_head(r,, )), j,) | 


Step (2) Determine o from w | | 

-w provides instantiations for the terminals in the left hand side of the lemma. o instantiates 
- the terminals in the left hand side of the theorem. Our objective is to find a o so that when 
the left hand sides (of the lemma and the theorem) are instantiated by o and w, respectively, 
they simplify to the same expression. . 

For instance, in the current example, the left hand side of the theorem is 
36(APPEND(c, Insert(d, 1))), whose normal fom is 
= Append(36(c), Add_at_head(36(d), 36())). The left hand side of the lemma is 
= Append(q, Add_at_head(R, i)), which was obtained by replacing 36(d) by r, and 36(c) by 


q. » maps r to Nullg, and leaves the rest of the terminals unchanged. Therefore, o should 


© 
l 


o 
I 


o 
| 


map d to an expression such that Nuliq = 36(d) is a theorem in the equational theory of PW. 
Therefore, the instantiation for d can be determined using the first two synthesis rules by 
generating a theorem that has Nullq on the left hand side, and an expression of the form 
46(?e) on the right hand side. The generation sequence is shown below. The first theorem is 
obtained by invoking Synthesis Rule (1) for the expression Nullg. The second theorem is 
obtained by using Synthesis Rule (2); rewrite rule (14) of PW is used for expand. The right 


hand side, 36(Create), of the theorem generated determines o as {d+ Create}. 


Nullq = Nullq 


= 36(Create) 


4.5 An Abstract Implementation of the Derivation Procedure 


Below, we give an implementation for a procedure Generate-a-rule. The procedure 
determines a suitable right hand side expression for a rewrite rule in a partial preliminary 
implementation given the left hand side expression. The procedure also expects a Perturbed 


World and a termination ordering as inputs. The procedure is implemented in a high level 
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algorithmic language whose semantics is self-explanatory. 
The implementation assumes that there exist two procedures 
: Is-an-inductive-theorem-of and A-suitable-instantiation-for-lhs. The lattér finds a suitable 
substitution that determines the instance of synthesis equation to be generated. 

The procedure performs essentially the theorem generation illustrated before in a . 

. systematic fashion. Roughly, it operates as follows. It finds the instance of the left hand side 
. of the synthesis equation by applying A-suitable-instantiation-for-lhs to J6(ihs). It simplifies 

this expression to its normal form. The normal form is then expanded repeatedly using 
‘ appropriate rewrite rules of PW until a suitable right hand side is encountered. 

The nontrivial aspect of the procedure concerns performing expansion in an 
effective fashion. There are two problem areas, Firstly, expansion is not uniformly 
terminating. That is, expansion is a potentially nonterminating activity. The procedure uses 
the termination ordering > to circumvent this problem. The right hand side has to be an 
expression that is less than the given left hand side. But, expanding an expression always 
gives rise to a bigger expression in the ordering >. Thus, the procedure can be terminated 
the moment we encounter an expression that is not less than the left hand side. (Note that the 
> is such that there can only be a finite number of expressions less than any given 
expression.) 

Secondly, expansion is not uniquely terminating. That is, an expression can be 
expanded in several different (but finitely many, because there are only finite number of rules 
in PW) ways using the rules in PW. All of them do not necessarily lead to the same final 
expression. Some of them may not even lead to a suitable right hand side expression. In the 
examples illustrated earlier, the rules of PW were carefully chioneii so that they resulted in the 
desired right hand side. A working implementation, however, is forced to keep track of all 
possible expansions since any one of them can result in the desired right hand side. In the 
implementation given below the variable S is used for this purpose. 

This chore, in fact, happens to be the main source of inefficiency in the synthesis 
procedure. We use the following obvious ways of getting rid of unproductive expansion 
paths. Firstly, type information is used to eliminate some of the candidate rewrite rules for 


expansion. Secondly, expansions that result in an expression that is not less than the left hand 
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- side are not going to be fruitful. Finally, we make a distinction between the variables that 
appear in the rewrite rules of PW, and the ones in the given left hand side. The latter, which 
° are terminals, are treated as constants. This eliminates several rewrite rules for expansion that 
‘ are candidates otherwise. 

It should be noted that the procedure given below is. only a part of a complete 
. implementation of the synthesis procedure. The other part is expected to determine the left 
: hand side of the rules. We have assumed that there exists a procedure to determine the left 
hand sides. If the following procedure does not succeed in finding a suitable right hend side 
for a given left hand side, then another set of left hand sides have to be generated, and the 


following procedure reexecuted. 


Generate-a-rule = proc (PW: Perturbed World, lhs: F(g,,--++8), 
>: ordering) returns (Rewrite Rule) 


%lnitialization 

o: Substitution «— A-suitable-instantiation-for-lhs 
ilhs + ofths) 

S + {36(ilhs)+} 


repeat 


%Test if expansion can be stopped 

if There-exists-a-suitable-candidate-in(S) 

then rhs ~~ Fetch-a-suitable-candidate-from(S) 
return(Ihs —+ rhs) 
endif 


%If a candidate has not been generated yet, expand by one more step 
Sl-@® 
for every t € S do 
$1 — S1 U set-of-all-expansions-of t by PW 
endfor 
S+S$l 


%Drop from SI unproductive expressions 
for every t€ S, do 
if ~(ths > t) then S1 + S1 - {t} 


forever 
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%Subprocedure description 


. There-exists-a-suitable-candidate-in: subproc (S: Sct{Expression]) returns (Boolean) 


‘if it € S such that 
3 I6(F(g,, ..., 2.) = 16(?rhs) € Genfilhs = ¢] such that 
(1) ?rhs does not contain 36 or operations of the implemented type, 
(2) F(g,, ae 8) > ?rhs, and 
(3) 1s-an-inductive-theorem-of-PW(J6(F(g,, ares g) = J6(?rhs)) 
then return(True) else return(False) 


end subproc 


%Subprocedure description 


Fetch-a-suitable-candidate-from: subproc.(S: Set[Expression]) returns (Expression) 


if ¢ € S such that 
3 I6(F(g,, .-., 8,) = J6(2rhs) € Genfilhs = ¢] such that 
(1) ?rhs does not contain 36 or operations of the implemented type, 
(2) F(g,, + +058) > Prhs, and 
(3) Is-an-inductive-theorem-of-PW(J6(F(g,, . .., 8,) = J6(?rhs)) 
then return(t) 
end subproc 


end Generate-a-rule 


set-of-all-expansions-of_by: Expression X Rule -> Set[Expression] 
Usage: _ set-of-all-expansions-of t by y + 6 
Returns: Returns the set of all possible expansions of a given term via a given rule. 


set-of-all-expansions-of_by: Expression X Set{Rule] -> Set{Expression] 
Usage:  set-of-all-expansions-of t by % 
Returns: The set of all terms s such that 

s = U set-of-all-expansions-of t by R, for all RE R 


expand_in_by: Expression X Occurrence X Rule -> Expression 
Usage: expand t, in u by y —+ 8 . 
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Pre: Varset(t,) M Varset(y) = ® : %For convenience 
t,/u is-unifiable-with § 

Returns: expand t, in u by y — 6 yields a term t, such that every term that reduces (in u by y + 5) 
to an instance oft, will be an instance of t,. In other words t, is the most general instance of 
all the terms that reduce (in u by y — 6) to an instance of t,. Note that the result the 
function returns is unique upto permutations of the variables. This is because 0, which is 


the most general unifier of two terms, is always unique when restricted to the variables in 
the two terms t, and 5. 


expands-to_in_by: Expression X Expression X Occurrence X Rule -> Bool 
_ Usage:  t, expands-to t, in uby y —+ 8 

' Pre: Varset(y) M Varset(t,) = ® 

' Returns: A predicate that tests if a term expands to another given term. 
(t,/u) is-unifiable-with 6 A t, = expand t, in ubyy— 
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5. Extending the Derivation Problem 


The derivation problem and the derivation procedure described in the last chapter 
apply to a situation in which the representing domain (%) for the desired preliminary 
implementation is unrestricted. That is, & includes all the values of the representation type. 


This section extends the problem to the more general situation where % is a subset of the 


- value set of the representation type. 


% contains the set of values that are permitted to be used by a preliminary 
implementation for representing the values of the implemented type. It is characterized by 
the association specification supplied by the user. Suppose A and § are the abstraction 
function and the invariant specified by the association specification respectively. Then % is 
the set of all values for which J is true. The present situation is one in which J is true on only 
a subset of the representation value set. 

For instance, consider the association specification given in Fig. 15. This example 
will be used to illustrate the procedure described in the chapter. It specifies an 
implementation of Queue_Int interms of Array_Int X Integer X Integer. The abstraction 
function .A can be described informally as follows. Nullq can be represented by any triple in 
which both the integer components are equal. A nonempty queue can be represented by a 
triple <v, i, >. v is an array of arbitrary length containing the elements of the queue, in order, 
between the index values i and j-1. In other words, i points to the front end of the queue, and 
j points to the next availabe position in v for adding a new element into the queue. The 


invariant J is true on all triples such that i < j and the array is guaranteed to be defined on all 


Fig. 15. Queue_Int in terms of Triple 


At<y, i, D>) = Nullg 
A(<Assign(y, e, j), i, j+1>) = ifi = j+1 then Nullq 
else Enqueue(A{<y, i, j>), e) 


I{<v, i, D) = True 

5(<Assign(y, e, j), i, j+1>) = if i = j+1 then Tre 
else if i < j+1 then 5{<v, i, >) 
else False 
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+ index values between i and j. 
~ §.1 Characterization of the Problem 


The criterion of correctness (stated in the previous chapter in Sec 4.2.1) that was 
* used to characterize the problem earlier is applicable in the current situation as well. For 
- convenience, we repeat the criterion below: A preliminary implementation of a data type is 
- correct with respect to an association specification (that characterizes an abstraction function 


A, and a representing domain %) if the following properties hold. 
(1) Totality Property: Every implementing function is total over %. 


(2) Homomorphism Property: The implementing function F and the operation f of the 
implemented type are related by the following homomorphism property: 
(VW r€ K)IC(F(..., r,..)) = f(..., 36(r),...)], where 36 is a function defined as: 
HO) = AMifres 


r otherwise 


Based on the above criterion, the derivation of a preliminary implementation was 
viewed earlier as a prcolem of finding a set of rewrite rules PI so that PI U IW and PIU PW 
satisfy the principle of definition. We still view the problem the same way. But, now the 
implementing functions need be defined only on the values in %, and the homomorphism 
property need only be verified on the values in h, This means that PI U IW and PIU PW 
need satisfy the principle of definition only with respect to a subset!? of the set of all 
generator constants of the representation type. This subset is the representing domain of 
constants T characterized by the association specification as follows: T = { t | 5(t) = True}. A 
proof of the claim that if PI UIW and PIU PW satisfy the principle of definition with 
respect to T, then PI is correct can be carried out along the same lines as the proof of the 
Correctness Theorem (Sec. 4.2.2). The proof for the present case can be obtained by 


19. A system S satisfies the principle of definition with respect to T if the every constant of the form 
Fig, ees g,)s where F is a nongenerator function symbol and &,.-- +» 8, are generatore constants in 


T, has a unique normal (in S) that is a generator constant in T. 
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- systematically replacing in the earlier proof the phrase "the principle of definition" by the 


phrase “the principle of definition with respect to T". 
_§.2. Derivation of a Preliminary Implementation 


First we formulate the synthesis conditions that are used as a guide in the derivation 
~ of a preliminary implementation, and then describe a procedure to derive a set of rewrite 
' rules PI that satisfies the synthesis conditions. The synthesis conditions are sufficient to 


ensure that PI U IW and PI U PW satisfy the principle of definition with respect toT. 
5.2.1 The Synthesis Conditions 


The synthesis conditions for a preliminary implementation PI are the following: 
(1) Totality Condition: 


(a) PI is well-spanned with respect to T (for every implementing function) 
with every rule in it being of the form F(g,,...,g,)— t, where F is an 


implementing function symbol, and g,,..., g, are generator expressions. 


(b) PI has the uniform termination property. © 


(2) Uniqueness Condition: PI has the unique termination property. 


(3) Homomorphism Condition: For every rule F(g,,..., 2,) — tin PI, 
4(g,) A... A 5(g.)9 => 6(F(g,, +» 8.) = 16(t) is a theorem of PW. 


(4) Invariance Condition: For every rule F(g,,...,8,) + t in PI, where the range of F 
is the representation type, 4(g,) A... A 4g.) => 5) = True is a theorem. 


It is interesting to note the effect of the presence of the invariant 5 on the synthesis 


20. Here, we assume that each of the expressions 8) +--+» 8, is of the representation type. If not, the 
antecedent would consist of a conjunction of J applied to only those expressions among §,,.-.-,8, 
that are of the representation type. The same qualification applies to condition (4), as well. 
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conditions. The Totality Condition: and the Uniqueness Condition remain as before, and 
serve the same purpose: The Totality Condition ensures that an implementing function is 
defined and terminates on every value in the representing domain. | The Uniqueness 
Condition ensures that an implementing function yields a unique value on every argument. 
| The Homomorphism Condition, which ensures that every implementing function satisfies the 
; homomorphism property, now requires that J6(F(g,,..-,8,)) = 36(0 be a theorem only 
under the assumption that the arguments to F satisfy 3. The Invariance Condition imposes an 
additiona! constraint on the expression that may appear on the right hand side of a rule: It 
ensures that every implementing function preserves 5. The Synthesis Theorem to follow 
shows that when PI satisfies all the synthesis conditions PI U IW and PI U PW satisfy the 
principle of definition with respect to T. 


The Synthesis Theorem 


Theorem 2 Let Pl be a set of rewrite rules that satisfies all the synthesis conditions. Then, 
PIU IW and Pi U PW satisfy the principle of definition with respect to T, where T is the 


representing domain of constants characterized by the invariant J. 


Proof Appendix III 


5.2.2 Deriving the rules of PI 


The derivation PI follows the same general pattern as before. The first task is to 
construct the PW which is done as before by combining the specification of the implemented 
type, the homomorphism specification, and any desired parts of the specifications of the 
implementing types. The homomorphism specification is derived from the abstraction 
function specification as before (sec. 4.2.2). For instance, PW for the example under 
consideration is given in Fig. 16. Note that PW does not contain the invariant specification. 
The infomation pertaining to the invariant will be maintained as a different entity. This will 
be explained shortly. 


The rules of PI are derived so that every synthesis condition except the Uniqueness 
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: Fig. 16. The Perturbed World. 


_ (1) Front(Nuliq) - ERROR 
‘ (2) Front(Enqueue(Nullq, e)) > e 
~ (3) Front(Enqueue(Enqueue(q, e1), e2)) + Front(Enqueue(q, e1)) 


(4) Dequeue(Nullq) ~ ERROR 
. (5) Dequeuc(Enqueue(Nullg, e)) — Nullq 
- (6) Dequeue(Enqueue{Enqueue(q, el), c2)) + Enqueue(Dequeue(Enqueue(q, e1)), e2) 


, (10) Append(q, Nullq) — q : : 
(11) Append(q1, Enqueue(q2, ¢2)) + Enqueue(Append(ql, q2), e2) 


(12) Empty(Nullq) — True 
(13) Empty(Enqueue(q, c)) — False 


(14) 16(<y, i, >) + Nullg . 
(15) 36(< Assign(y, e, j), i, j+1>) — if i = j+1 then Nullq 
else Enqueue(36(<y, i, j>), J€(e)) 


(16) I6(NULLQ()) — Nullg 

(17) H(ENQUEUK(e, i)) + Enqueue(36(c), 36(i)) 
(18) 36(DEQUEUE(c)) — Dequeue(J6(c)) 

(19) 36(APPENDX(c1, c2)) — Append(36(c1), 36(c2)) 
(20) 4(EMPTY(c)) — Fmpty(36(c)) 


(21) 36(if_then_else(b, v,, v,)) — if_then_else(b, J6(v,), J6(v,)) 


Condition is met. The procedure derives the preliminary implementation for one operation 
at a time by deriving a separate set of rewrite rules for every operation. The method used is 
the same for every operation. The procedure first determines the left hand sides of all the 
rules to derive a partial preliminary implementation. Then, it determines a suitable right 


hand side for each of the rules in the partial preliminary implementation. 


- 102- 


+ §.2.2.1 Determining the Left Hand Side 


The technique used for determining the left hand sides is the same as before 
‘ because the Totality Condition, which is used for the purpose, is the same as before. The left 
* hand sides are derived so that the set of expressions appearing as arguments to every 
- implementing function is well-spanned.?! Fig. 17 gives a possible set of left hand sides for a 
preliminary implementation for the example under consideration. As before, we use the . 


question mark identifiers as place holders for expressions to be determined yet. 


Fig. 17. A Partial Preliminary Implementation 


Representation 


Array_Int X Integer X Integer 


Definitions 
NULLQOQ — ?rhs, 


ENQUEUE(¢y, i, >, ¢) + ?rhs, 


FRONT(<y, i, >) — ?rhs, 
FRONT(<Assign(y, ¢, j), i, j+1>) — ?rhs, 


DEQUEUE(y,, i, >) + rhs, | 
DEQUEUE(<Assign(,, , i), i, j+1>) — ?rhs, 


APPEND(CV1, il, j1>, <v2, i2, 12>) + Prhs, . 
APPEND(<v1, i1, j1>, <Assign(v2, e, j2), i2, j2 + 1>)—+ ?rhs, 


EMPTY(y, i, >) —+ 2rhs, 


21. Note that if a set is well-spanned, then it is well-spanned with respect to any set of generator 
constants, 
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- 5.2.2.2 Determining the Right Hand Side 


The general strategy used to derive the right hand sides is the same as before. They 
"are derived so that the Homomorphism Condition, the Invariance Condition, and the second 
part of the Totality Condition (which is left unensured while determining the left hand side) 

are ensured. The right hand side of a rule is determined by deriving a. synthesis equation 
| corresponding to the rule. A synthesis equation corresponding to a rule F(g,,..-, 8.) -+ ?t is 
" an equation of the form 36(F(g,,---,8,)) = J6(qt) that satisfies the following conditions: 


(1) 4(g,) N..A 4J(g,) => J6(F(g,, Sues g,)) = J6(?t) is a theorem of PW. 


(2) Ifthe range type of F is the representation type, then 
Hg) A... A Mg.) => §(t) = True is a theorem of PW. 


(3) F(g,,---> a) > ?t, > is the termination ordering on expressions, 


(4) ?t may only contain only the permitted operation symbols of the implementing 


types, and the implementing function symbols. 


Note that the synthesis equations have additional constraints here because of J. So, 
the derivation of the synthesis equations is going is going to have to be performed slightly 
differently. This is the topic of the next section. 


5.3 Deriving the Synthesis Equations 


The general strategy used for deriving a synthesis equation is the same as before. 
That is, we generate a series of theorems of PW until we encounter one that qualifies to be a 
synthesis equation. We use the same pair of synthesis rules for generating the theorems of 
PW. The only difference lies in the set of rewrite rules used for expansion while generating 
the theorems. Earlier, the rewrite rules in PW were used. But now, it is necessary to use an 
additionalset of rewrite rules. 

There are two reasons for this. Firstly, a synthesis equation 
HF, webis ey) = 36(?t) to be derived is a theorem of PW in a special context: A context 
determined by the fact that £,+++8, Satisfy the invariant J. In deriving the synthesis 
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* equations, one has to use rewrite rules describing this context besides the rewrite rules in PW. 

- Secondly, ?t has to be determined so that 3(?t) = True is a theorem. For this, it is necessary 

to use the rewrite rules in the specification of 3. These additional rewrite rules, which 
describe information pertaining to the invariant, are maintained as a separate entity called the 
Temporary World (TW). We will discuss more about TW - its composition, and its 
construction - later. It is sufficient to say the following at this point: TW consists of rules that 
specify J, and rules that assert that Speer B, satisfy the invariant. The rules in TW are used 
for expansion as well as to ensure that 2 satisfies 4. . 

It should be noted that part of the Temporary World used in the derivation of a 
preliminary implementation could be different for different rules in the preliminary 
implementation. This is because the argument expressions appearing on the left hand side 
(g)-- 58) are usually different for different rules, Consequently, the part of TW that 
changes has to be constructed afresh at the beginning of the derivation of every rule. (The 


temporary life time of a part of TW is what prompted us to name TW a Temporary World.) 
5.3.1 A Simple Illustration 


In the following, we show the derivation of a synthesis equation corresponding to 
the rewrite rule ENQUEUE(<,, i, >, 2) rhs, in the partial preliminary implementation 
shown in Fig. 17; The derivation provides an illustration of how the generation of theorems 
is influenced by TW. It also illustrates for the first time performing expansion using rewrite 
tules that have conditional expressions in them. | 

The TW used for the derivation is shown below. For ease of reference, also given 
below are rules excerpted from PW (Fig. 16) that are relevant in the present derivation. 
Rules numbered (9) and (10) in TW are the specification of 3. The rule numbered (11) asserts 
that the argument <v, i > to ENQUEUE satisfies 3. The fourth rule is a property of the 
invariant: Any triple <y, i, > that satisfies J is such that i <j. This can be proved as a theorem 
from the specification of J. We will see how this is obtained in a subsequent section where we 


discuss more about the Temporary World. 


The Relevant Rules of PW 


- 105 - 


» (1) HKy, i, Dd} — Null : 
(2) HK Assign(y, ¢, j), i, j+1>) > ifi = i+ 1 then Nullq 
else Enqueue(6(<y, i, >), 3G6(e)) 


’ (3) 6(ENQUEUE(x, e)) > Enqueue(36(x), 36(e)) 


(4) if_then_else({False, v1, 2) — v2 
(5) if_then_else(True, vl, v2) — vl 
(6) J6(if_then_else(b, v1, v2)) — if_then_else(b, 36(v1), 36(v2)) 


(7)x = y+1— not(x < y) 
(8) not(True) — False 


The Temporary World 


(9) 5(<v, i, >) True 
(10) 5K Assign(y, ¢, j), ,j+1) + 1<j41 A fi = j+1 V KV, LP) 


(11) I< y, § >) — True 
(12) i< j— True 


Shown below is a generation of a series of theorems by invoking the synthesis rules 
using the rewrite rules shown above for expansion. The generation results in the derivation 
of a synthesis equation of the form we desire. The first theorem in the series is obtained by 

‘invoking Synthesis Rule (1) for the expression 16(ENQUEUE{<,y, i, j>, 2); the normal form 
of this expression is Enqueue(36(<y, i, />), 6(e)). The rest of the theorems in the series are 
obtained by invoking Synthesis Rule (2) using different rules in PW and TW for expansion. 

An explanation about our choice of the rewrite rules for expansion in the following 
derivation is in order. Recall that the ultimate objective of expansion is to drive the symbol 
36 in the right hand side of the equation in Step (1) to the outermost level of the expression. 
Inspection of the rules of PW reveals two possible sets of rules which could be used for this 
purpose. The first one is the 36-rules, in particular, Rule (3) of PW; however, applying this 
tule in Step (1) will yield an expression identical to the one on the left hand side which is not 
acceptable. The other possibility is applying the rules of the homomorphism specification, 
ie., either Rule (1) or (2) of PW. Rule(1) is clearly not applicable. Rule (2) is also not 
applicable. A closer look, however, reveals that Enqueue(36(<v, i, />), 36(e)) has the form of 


the expression in the else-arm of the conditional expression on the right hand side of 
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. Rule (2). Hence, we make an attempt to expand Enqueue(36(<y, >), 36(@) to an expression 
of the form if_then_else(..., ..., Enqueue(36(<¥, i, >), 36(e))). The manipulations performed in 
. Steps (2) through (4) are precisely aimed at this. 


Form of synthesis equation to be derived: J(ENQUEUE(K», i, >, e)) 
Normal form of J36(ENQUEUE<y, i, >, 2): Enqueue(36(< >, i, >), 16(2) 
Rules used for simplification: 


Step (1) Invoke Synthesis Rule (1) on JE(ENQUEUE(K»y, i, Pp, e)) 
JE(ENQUEUE(<y, &, >, €)) = Enqueue(J6(<», 4 />), 16(2) 


Step (2) Expand Expression: Enqueue(36(< v, i, j>), 36(e)) 
Using Rule: (4) 


36(ENQUEUE(K v, i, j>, €)) = if False then v1 else Enquene(36(< v, ¢, >), J6(e)) 


Step (3) Expand Expression: False 
Using Rule: (8) 


nme nee meee ences sencocre reese easeseesssernenesnencescesaneseeesnense 


36(ENQUEUEK », ¢, >, €)) = if ~(True) then y1 else Enqueue(36(<», i, j>), F6(e)) 


Step (4) Expand Expression: True 
Using Rule: (12) 


Rw eee ee meee rene Se Se eeS ener nee Ben esens Se ener ESseEessemnaeeseneneees 


J6(ENQUEUEK », ¢, >, e)) = if not(i < /) then v1 else Enqueue(36(< v, i, J>), 46(0) 


Step (5) Expand Expression: ~(i <j) 
_ Using Rule: (7) 


PB Ons Meme eens Bees wre BESS EEe SNOOP SEES EER SERRE ESEASeeSeEEEHETENEEE 


J6(ENQUEUEK y, , >, ©) = if i = j+1 then v1 else Enqueue(316(<v, i, >), IG(e)) 


Step (6) Expand Expression: if i = j+1 then vl else Enqueue(36(<v, i, >), 36(2) 
Using Rule: (2) 


AR Pere S BOS RSens Sn EN SS eee ese REeSSErs Sr eTSsESeU NS nReRANEDE EEE E EEN 


JE(ENQUEUEKK », ¢, >, &)) = H(KAssign(y, e, ), § j+ 1) 
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“ Note that the right hand side of the last theorem in the above series is 
such that 


ENQUEUE(K<), i, >, &)} > <Assign(y, e, /), i, j+1> 
5(< Assign(y, e, j), 5 j+1>) +* True 


* Hence, we have the following preliminary implementation for ENQUEUE: 
ENQUEUE(<), 4 >, > <Assign(y, e, ), i j+1> 


‘Let us, for a moment, draw the attention of the reader back to steps (2) through (4) 
in the above derivation. Their aim was mereciy to expand Enqueue(36(<», i, >), 16(e)) to a 
conditional expression that had the former expression as its else-arm. The purpose of such a 
‘transformation was to make it possible to apply (for expanding) a rewrite rule that had a 
conditional expression on the right hand side. A situation such as this is encountered 
commonly during the generation of theorems. This is especially so when the rules of the 
input specifications have conditional expressions in them. Hence it is useful to extend the 
definition of the mechanism expand so that rewrite rules with conditional expressions on their 
right hand side can be applied directly to an expression that is not a conditional expression. 
We describe the extension below. In future illustrations of the derivation of synthesis 
equations, we will be using the extended version of expand. 


Suppose e, — if_then_else(b, e e,,) is a rewrite rule, and a is an expression that is 


2 
being expanded by using the former rule. According to the existing definition of expand, the 
following protocol is used for expanding a: 


Protocol I: 


(1) Check if a (or a subexpression in it) is unifiable with if_then_else(b, e,,. e,,); if so, 
let 6 be the most general unifier. 


(2) Replace 6(a) (or the subexpression in it) by oe,) 


Note that according to the above protocol « is expandible only if « (or a subexpression in it) 
is of the form if_then_else(...). Now, we introduce two additional ways in which the rule can 
be used for expansion. 


Protocol 2: 
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(1) Check if « (or a subexpression in it) is unifiable with e.,; if so; let @ be the most 


vt 
general unifier. 


(2) Check if 6(b) +* True, or ~(6(b)) * False. 


(3) Ifso, replace @(a) (or a subexpression in it) by @(e,). 


Protocol 3: 


(1) Check if (or a subexpression in it) is unifiable with e,,; if so, let @ be the most 


22 
general unifier. 


(2) Check if 6(b) -+* False, or ~(6(b)) +* True. 
(3) Ifso, replace 4(«) (or a subexpression in it) by a(e,). 


Using Protocol 3, the preliminary implementation of Enqueue derived earlier can be 
obtained in just two steps as shown below. The theorem in step (1) is obtained as before. The 
theorem in the second step is obtained by using Rule(2) of PW for expansion under 
protocol (3). Note that the boolean expression under consideration is i= j+1,; 
i = j+1—-+* False by Rules (7), (12) and (8). 


Form of synthesis equation to be derived: JE(ENQUEUEK», 4, D, 2) 
Normal form of 4(ENQUEUEK y, i, >, e)): Enquene(J6(<v, i, >), 36(e)) 
Rules used for simplification: 


Step (1) Invoke Synthesis Rule (1) on H(ENQUEUEKK »y, i, >, €)) 
IE(ENQUEUEK », ¢, >, e)) = Enqueue(J6(<), i, >), 36(e)) 


Step (2) Expand: Occurrence: A 
Expression: Enqueue(36(<v, i, j>), J6(e)) 
Using Rule: (2), Protocol 3 


OOO We OR RSE REESE EERE RES aE EE EEN OLeDeReREeSEEeELSESteE ELSES SHES EEEES 


JE(ENQUEUEKK y, i, >, 2) = HK Assign(y, e, J), 5 f+ 1D) 


It should be pointed that the addition of protocols (2) and (3) does not enhance the 
generality of the original definition of expand. In other words, we can show the following: 
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: Suppose f can be obtained from a:in a finite number of expansion steps using a rewriting 


oe 


system R under protocols (1), (2) and (3). Then, B can also be obtained from a in a finite 
number of expansion steps using only protocol (1), provided R contains the following rules 
that specify if_then_else: 

if_then_else(True, v,, ¥,) — V, 

if_then_else(False, v,, ¥,) > V, 

The reason for introducing protocols(2) and (3) is to reduce the number of 
expansion steps needed in the generation of theorems. The two rules of if_then_else given 
above make expansion uneconomical because the right hand side of each of them is a 


variable. This makes each of them a candidate for being used for expansion at every step of 


: the theorem generation process. Use of protocols (2) and (3) in effect limits the use of the 


above two rules to cases where there is a rewrite rule with an if_then_else in its right hand 


side, and which could be used for further expansion. 
5.3.2 More on the Temporary World 
5.3.2.1 The Purpose of TW 


The Temporary World (TW) serves two purposes: Firstly, it holds information 


about the invariant J. Secondly, it provides a means of keeping a log of certain assertions that 


‘are needed for temporary stretches during the course of the derivation of an preliminary 


implementation. Some of these assertions are generated automatically by the procedure; 
others are supplied by the user. 

The information about J and the assertions are entered into TW as rewrite rules. 
(The derivation procedure may use the rules in TW for expansion like the rules of PW, the 
Perturbed World.) The assertions needed may change during the course of the derivation of a 
preliminary implementation. Some of the assertions needed can only be determined during 


the course of the derivation. Because of these reasons, TW is treated as a dynamic world, i.e., 


_aworld that changes during the course of the derivation of a preliminary implementation. In 


contrast, PW keeps a log of the facts needed through the derivation of the entire preliminary 


implementation. 
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There are three reasons why temporary assertions might be needed during the 
_ derivation. Firstly, the equation 36(F(g,,...,g,)) = 36(?t) being searched for is a theorem of © 
PW only under the hypothesis that the arguments to F satisfy J. The second reason arises in 
checking if ?rhs satisfies J, i.e., if j(?rhs) = True is a theorem. This check has to be 
_ performed under the hypothesis that the arguments to F satisfy J. Also, performing this 
check may need the use of the inductive logic. In such a case, it is necessary to set up 
appropriate hypotheses for the induction. 

The third reason for the need for assertions arises while one is attempting to expand 
a subexpression of a conditional expression if_then_else(b, e,,e,). Under such a situation, we 
may assume that b is False while expanding a subexpression in the else-arm, or that b is True 
while expanding a subexpression in the then-arm. For instance, consider the expression 


if_then_else(i= j+1, e,, Enqueue(36(<yy, i, >), 36(e,))). In this case, the subexpression 


? 
Enqueue(‘6(<y, i, j>), 36(e,)) is expandible by the rewrite rule 
46(<Assign(yy, e, j), i, j+1>) — if i = j+1 then Nullq else Enqueue(36(<v, i, j>), 36(e)) 


only if we make the hypothesis that i = j+ 1 +* False. 
5.3.2.2 Construction of TW 


TW consists of two parts: A static part, and a dynamic part. The static part remains 
unchanged for the entire duration of the derivation .of the preliminary implementation. The 


dynamic part may change during the derivation. 
5.3.2.2.1 The Static Part 


The static part consists of information about the invariant J. It consists of 


() A set of rewrite rules that constitute the specification of 5. The specification of 3 
involves other data types which are among the implementing types. We assume that 
the static part contains their specifications also. In the examples we discuss, only the 


relevant rules from these specifications are displayed. 


4 
a 
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(2) A set of rewrite rules that express additional properties about 3. 


The rewrite rules mentioned in (1), above, can be constructed automatically from 


the association specification, The information in (2) is something the user has the option of 


supplying additionally for deriving a preliminary implementation in the presence of a 
nontrivial invariant. This information is needed for the following reason: There are several 


preliminary implementations whose derivation is dependent on lemmas that express 


_ interesting properties about the invariant. Although it might be possible to prove these 


lemmas from the specification of J, the derivation procedure cannot automatically discover 


_ the desired lemma. The rewrite rules in (2) specify these lemmas. 


The static part of TW used for the current example is given below. Rules (1) and (2) 
are constructed from the specification of J given as part of the association specification in 
Fig. 15. Notice that the right hand side of rule (2) is a simplified version of the right hand 
side of the corresponding equation of the specification.of J. The rules used in the 


simplification are (10), (11), (8), and (4). Rule (3) specifies a property of 3. It asserts that if a 


- triple <v, i, j> satisfies J, then i<j. The property can be proved from the specification of § 


using the KB-method. Rules (4) through (11) belong to the specification Integer and Bool. 


These rules will be used in the examples that follow. 


. (1) IKv, i, >) + True 


(2) S(<Assign(y, ¢, j), J+) +i <J+1A [i = j41 V IKViD) 
(3) 5(<v, i, >) => i <j— True 
(4)x=yVx<sy7x<sy 


(5) True V x — True 

(6) ~x V x > True 

(7) ~(x Ay) > ~x V ~y 
8)xVy¥AzaKkVyYAKVA 
(9) (x A y) => y > True 

(10) if_then_else(b, True, e,) 4bV e, 
(11) if_then_else(b, e,, False) + b A e, 


=1i2 = 


, 5.3.2.2.2 The Dynamic part | 


This is the part that may change during the course of the derivation of a preliminary 

| implementation. It may vary from the derivation of one rule of the preliminary 
implementation to another; within the derivation of a single rule, it may vary from one 
theorem generation step to the next. By a theorem generation step, we mean the following: 
Recall that the derivation of a rule involves generating a series of theorems. The generation 
of every theorem in the series is considered as a theorem generation step in the derivation of 
the rule, 

The dynamic part is empty at the beginning of the derivation of every rule of the 
implementation derfinition. Assertions (in the form of rewrite rules) are added to and 
removed from the dynamic part at specific instants during the derivation of a rule, Every 
assertion that is added during the derivation of a rule is removed by the end of the derivation. 
Every time an assertion is added to TW, it is important to ascertain that the addition does not 
render TW inconsistent. To ensure consistency, we run_ the predicate 
Is-an-inductive-theorem-of2~ (see sec.4.4.3.2) on TW every time an assertion is added to TW. 
(Note that TW is convergent to begin with. This is because the static part, which consists of 
the specification of J, is guaranteed to be convergent.) The assertion is added only if the 
Is-an-inductive-theorem-of succeeds. In some cases the is-an-inductive-theorem-of may 
succeed by generating a finite number of new assertions. In several situations it is useful to 
add these new assertions also to TW. If these assertions are, indeed, added to TW, then they 
should also be removed along with the original assertion. 

The assertions in the dynamic part can be classified into two categories based on the 


life time of their existence. We describe the construction of the two categories below. 


Arguments-Assertions 


These assertions are added at the beginning of the derivation of a rule. They remain 


22. We assume that the predicate Is-an-inductive-theorem-of is run iteratively a fixed number of 
times that is finite. 


13> 


‘in TW until the end of the derivation of the rule. We call these assertions 


- Arguments-Assertions because they are dependent on the expressions supplied as arguments 


; to the implementing function for which the rule is being derived. For instance, if the rule 


'. being derived is of the form F(g,,- 3 8) — ?t, then the assertions are dependent on 


Speer 8, 


Arguments-Assertions can be of two kinds: The first kind assert that g,,....8, 


: Satisfy J. These are entered in TW as the rewrite rules 5(g,) + True,..., i(g,) — True. It is 


’ easy to see that these assertions can be constructed automatically. 


The second kind consist of assertions that are supplied by the user. These are used 
for ensuring that every rule of the preliminary implementation preserves the invariant J, ie., 
5(g,)A...AKg,) => 1F(g,,---,8,)). The assertions express the induction hypotheses that 
might be needed for checking the above property. The reason that the user might have to 
supply these assertion is the following. Recall that our method ensures the invariance 
property by deriving every tule F(g,,.--,8,) — ?t so that 5(?rhs) = True is a theorem of TW. 
(Note that TW already includes rewrite rules asserting that By--- 8, satisfy J.) If the 
preliminary implementation desired is such that 3(?t) = True can be proved automatically 
from TW using the equational logic or the KB-method for proving inductive properties, then 
no additional assertions are needed. However, if the preliminary implementation desired is 


such that the proof of 3(?rhs) = True needs induction hypotheses that cannot be generated 


rf automatically by the KB-method, then assertions expressing the induction hypotheses have to 


be added to TW. 

. The assertions used as induction hypotheses in all our examples are constructed by 
invoking the inference rule given below. The inference rule expresses a general induction 
principle that uses the termination ordering > as the well-founded partial ordering for the 
induction. Informally, the inference rule can be stated as follows. Suppose F(g,, bas g,) 7 
?t is the rule being derived. Then, in trying to ensure ‘(F(g,, sai »8,)), we may assume 
5(F(v,,...,¥,)) for any argument <v,,...,¥,> that satisfies J, and that is “less than" 
<B,5-++»B,> in the ordering >. 


Bp. 2258 > We. > 
Iv, )A...AI(¥,) => HE(V,,--- 5 ¥)) 
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» As an illustration, let us construct a set of Arguments-Assertions for the derivation of a rule 
- for APPEND. We will be using these assertions later when we illustrate the derivation of the 
» preliminary implementation for APREND. Suppose we are attempting to derive a rule of the 
- following form: ; 
| APPEND({K\,, i,, j,>» <Assign(y,, 6, j,), iy i,t 1>) — ?rhs | 
Then, the Arguments-Assertions may include the following rewrite rules. The first two 
assertions state that the arguments supplied to APPEND satisfy J. The third assertion is used 
as an induction hypothesis, ° 

5(Xv,, i,j) 9 True 

I(<Assign(y,, @, j,), i, j,+1>) + True 


5(<v,, i j,>) = HAPPEND(<),, i,, >, <¥p iy j,>)) + True 


Conditional-Expressions-Assertions 
The second category of assertions in the dynamic part’ is _ the 
-Conditional-Expressions-Assertions. A need for these assertions arises while expanding a 
subexpression of a conditional expression in the generation of theorems. These assertions are 
added to TW at the beginning of a theorem generation step, and removed at the end of the 
step. The Conditional-Expressions-Assertions needed in a step are determined by the 
occurrence of the subexpression that is chosen to be expanded for generating the theorem in 
that step. For instance, suppose the following is the theorem generated in the first step 
during the derivation of a rule for APPEND. 
He(APPEND(<»,, i, /,>» <Assigny,, & j,), iy J, + 1>)) 

= if.then_else(i, = j, + 1, Enqueue(6(APPEND(CY,, {1 /,>s <p ip 2) 9) 
Suppose we decide to generate the theorem in step(2) by expanding the subexpression 
J(APPEND(C¥,, i, Js <¥y iy j,>)) On the right hand side of the theorem in step (1). Then, 
we may add to TW the assertion i, = j,+1-+ False. The reasoning behind the addition of 
this assertion should be apparent by now. The subexpression chosen for expansion appears 
in the else-arm of a conditional expression. Hence, while expanding the subexpression we 


may (if we wish) assume that the corresponding boolean expression is False. In general, we 
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‘ may have to add more than one such assertion in a step because the subexpression could be 
_ embedded within more than one conditional expression. Suppose a is the subexpression 
: chosen to be expanded. Then, the Conditional-Expressions-Assertions for the step are 


- determined as follows: 


(i) For every conditional expression if_then_else(b,, waeQlnsy «.), Of Whose then-arm a is a 


part, add b, — True. 


(ii) For every conditional expression if_then_else(b,, ..., ...«...), of whose else-arm a is a 


part, add b, —+ False. 


5.3.3 Preliminary Implementation of Append 


This section derives a pair of synthesis equations corresponding to the two rewrite 
Tules in the partial preliminary implementation that define APPEND. It illustrates a more 
interesting utilization of the invariant j than was seen in the derivation of the rule for 
ENQUEUE. The derivation also demonstrates how a where construct can be introduced into 
a preliminary implementation, and why it is useful to do so. 

Recall the reason for introducing the where construct into the preliminary 
implementation language: To alleviate the limitation of the constraint that a preliminary 
implementation may not contain any helping functions or observers of the representation 
type. The constraint, in particular, makes it impossible to select the components of a tuple 
returned by an expression that appears on the right hand side of a rule. 

For instance, suppose we wish to construct a triple using the components of the 
triple returned by APPEND({<v,, is jy» iy iy J). A where construct permits us to do this 
by rewriting the above expression in the following fashion. 

<y, i, > where <y, i, j> is APPEND(<¥,, ,, j,>. “Vp by j,>) 

Then, the first argument can be further transformed to construct the desired triple. For 
instance, - 

<Assign(y, e» J), 5 j+D where <), i, > is APPEND({<y,, i522 Vy by jp) 

The new terminals y, i, j introduced should be distinct from the terminals that 
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«already exist in the expression that is being transformed. It should be noted that a where 
~ construct can always be eliminated from an expression provided we are permitted to use the | 
. selector operations of the tuple type: This elimination can also be performed automatically. 
| For instance, the where construct in the above expressions can be eliminated by systematically 
. Teplacing every occurrence of y, i, and j in the first argument to the where construct by the 
. following expressions: First(APPEND(¢y,, i,, /,>, <¥p iy J,)), Second(APPEND(<),, i, /,>; 
<¥y iy j,>)), Third(APPEND(<y,, i, js <¥» i J>)). (First, Second, and Third are 
operations that select the first, second, and third components of a triple.) 
Below, we give two rules concerning a where construct. The rules can be used at any 
step during the generation of theorems to transform the expression on the right hand side of a 
theorem. The first rule specifies how a where construct can be introduced into an expression. 
The second rule specifies how the position of where can be moved within an expression 


without altering its semantics. Suppose 
(1) F is an implementing function whose range is a triple type, 
(2) g is an arbitrary function, — 
QG) e, e,---»&, are arbitrary expressions, 


(4) », i, jare terminals that do not appear in the equation e = g(..., F(e,,..., C,) sone) 


Where-Rule (1) 
e = gf... Fe,,..., €,) see) 
e = gf..., <v, 5 > where <y, i, > is Fe,, woo 9 @) gan) 
Where-Rule (2) 


e = g(..., <», i, > where <y, bP 5 F(ey..-1€)-) 
e = gf... <v, i, J> ,.) where <v, i, D is F(e,, ..., &) 

_ A few remarks are in order at this point regarding expanding an expression that 
appears as a subexpression of a where construct. Firstly, an instance of a where construct is 
treated, for syntactic purposes, as an application of a function Where_is with three arguments. 

For instance, <Assign(, ¢,, /), i, /+ 1> where <v, i, > is APPEND(KY,, i,, j,9, <Y¥p iy i>) is 


ap He 


treated as the expression Where_is(<Assign(y, Cy DEIFD, <y, & P, APPEND(<»,, is J» 
<v,, i, j,)). Secondly, the second argument to Where_is may not be expanded; only, the 
. first and the third may be expanded. In the above example, for instance, <v, i, j» may not be 
expanded. This is because the second argument to Where_is has to be a tuple of terminals (or 
variables). It does not make sense to have a nonterminal expression as a part of the second 
argument; expansion will introduce a nonterminal expression, 

The third remark concerns the possibility of making temporary assertions while 
expanding subexpressions Of a where expression. Consider the example given above. 
Suppose we decide on expanding the expression <Assign(, ey J), i, j+1>. The terminals », i, 
and j in this expression are such that <y, i, > is acting as a place holder for APPEND(<y,, i,, 
J, Vy iy j,>). If APPEND({<),, bs Js Vy by J,>) happens to be such that 5(APPEND({<vv,, 
iis Js Vy by i) is True, then we may assume that 3(<¥, i, >) is also True as long we are 
: expanding the first argument to the where expression. This assumption may, in general, 
. enhance the possibility for expansion. Thus, expansion of a subexpression of a where 

expression may result in an update of the Temporary World (TW). For instance, in the above 
example, if we J}APPEND(CY,, i, j,>, <Vy , 
update TW with the assertion 5(<», i,j) + True. This is used in the derivation to follow. 


‘ jp) = True is a theorem of TW, then we may 


ArgsSet = { Argl: <<v,, i, 1,9, <¥» i, i>? 
Arg2: Cv, 1,4, <Assign(v,, ey J) ind,+ 1>> } 


Relevant Excerpts of the Perturbed World 


(1) 16(K<y, i, >) + Nullg 
(2) 16K Assign(y, e, j), i, j4+-1>) if_then_else(i = j+1, Null, Enqueue(36(<y, i, >), 16(e))) 


(3) '36(APPEND(x, y)) > Append(36(x), 6(y)) 
(4) 36(if_then_else(b, v,, v,)) — if_then_else(b, 36(v,), 36(v,)) 


Derivation of the rule corresponding to Arg! 


Form of the theorem to be generated: 36(APPENIX< V0 by IPs Vy by i>)) = JG(?rhs,) 
J(APPEND(K Vis Es Fs <Myy by i>) : KCK Vhs Jj?) 
Rules used for simplification: 
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Initial State of the Temporary World 


5(<v, i, >) + True 
~ 5(<Assign(y, ¢, j), i, j+1>) 9 iS j+1A fi =j4+1V JK, i, p)] 


4(y, i, >) => i<j— True 


I(K<y,, i, j,>) + True 


hd 4 


Step (1) Invoke Synthesis Rule (1) on J6(APPEND({< v 


1 


I6(APPENDKC»,, i,,4,>, <¥y ip £2) = 16(KY, >) 


’ is js <vy bs i>) 


APPEND(C¥,, 1,2 pp by 2) 4 Vp iy > 


Derivation of the rule corresponding to Arg2 
Form of the theorem to be generated: JG(APPEND(< Vhs As <Assign(y,, ¢,, J): i 4,+ 1>)) = K(?rhs,) 
36(APPEND(K Vo bys <Assign( Vy» Gy J,)s boht 1>))s: 
if_then_else(i, = j, +1, J6(< Vo b>) 
_Enqueue(Append(J6(<v,, 1, >), J6(<v,, i, 4,2), J6(e,))) 
Rules used for simplification: 


Initial State of the Temporary World 


Static Part 

(5) 5(<v, i, >) + True . 
(6) 5(<Assign(yy, e, j), i, j+1>) -i<j+1 Afi =j+1 V I«Kv,i, p)} 
(7) [5Kvy, i, >) => i<j] True 


Argumenits- Assertions 

(8) 5(<v,, i, j,>) + True 

(9) 5(<Assign(,, ey J), id, t 1>) - True 
("The following is as a consequence of Rule (9)") 
(9a), SALA LL = 441 V Ky, i, 5>)] + Troe 


(10) I(Ky,, i, ,>) => HAPPEND({<y,, i, >, <V,» i, 4,2)) + True 
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- Step (1) Invoke Synthesis Rule (1) on the expression 36(APPEND(< Vs ts Hs SV ashy i>) 
IG(APPENI(CY,, ,5,>s <¥y iy >) = 
if_then_else(i, = j, +1, 4», 1./>), 
'  Enqueue(Append(36(Xv,, i,,,/,>), (Kv, i, 7,2) I6(e))) 


~ Step (2) Expand: Occurrence: 3.1 
: Expression: Append(36(< Vs 44>), JO, 1, 7,>)) 
Using Rule: (3) . 
36(APPEND{< Vy Ao J,>» SAssign(y, €, J), oJ, +1) = 
if_then_else(i, = j, +1, 10(<v,, 1, /,>), 
: Enqueue(J6(APPEND(< v,, i, /,>s <4» 4, i”), J6(e,))) 


? 2 


Step (3) Transform: Occurrence: 3.1.1 
Expression: APPEND(C¥,, i, /,>, Vp by J.) 


Using Rule: where-rule (1) 


JG(APPEND(< y,, i, j,>, <Assign(y,, e,, j,), iLDiAtD) = 
if_then_clse( i= fth JK, i, j,>), Enqueuc(36(<y, 4, >), Je.) 


where Cy, i, > is APPEND(<»,, i, j.>, <vy i 4,2) 


Step (4) Expand: Occurrence: 3 
Expression: Enqueue(J6(<v, & j>), J6(e,)) 
Using Rule: 
TW Update: 
Added because expression is in scope of else-arm 
i, = j,+1 False 
i SAt1A IK, i, j,>) 4 Troe 
5(<v,, i, j,>) + True 
5(APPENIXC Vir bs APs < Mp by jp)) — True 
Added because expression is in scope of where 
SK, i, >) + True 
i<jTrue 
JE(APPENDC y,, i, j,>, CAssign( Vy ys i) LA+D) = 
if_then_else(i, = j, +1, (Cv, 1, >), Jo(CAssign(», e, 9, 4 j+1>) 
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where <v, i, > is APPEND(<¥,, 4, 5s <p i>) 


Step (5) Transform: Occurrence: 
; Using Rule: where-rule (2) 


see cee Bees esses e ees e san ease sane ee SER esEE DE Reneten eee neenesassense 


36(APPEND({< Vio bs is <Assign( Vas Es js ae 1>)) = 
if_then_else(i, = jt 1, 1K Ys iy i), 16(<Assign(y, ey Di j+))) 


where <v, i, > is APPEND({K Vis Lehr Vy by i) 


? 2 


- Step(6) Expand: Occurrence: A 
Using Rule: (4) 
16(APPENDKC¥,, 1, ,>, <Assign(v,, ¢,, 4), i,j, +12)) = 
JG(if_then_else(i, = j, +1, <v,, i, j,>, <Assign(y, €,)), 4 j+ 1>)) 


where Cv, i, > is APPEND(C¥,, i, j,>s <p ip.) 


2 


APPEND(<,,, i, j,>, <Assign(»,, 4). £.j,+1>) > 
if i, = f+] then <y, i, j> 


vu 
else <Assign(y, ey Dd, i. j+1> where <y, i, > is APPEND(K Vis hy dys SVs by Jj) 
Definition of APPEND 
APPEND(K), 1, 5,, <Vp iy 1D) 7 <p 14> 


APPENDKK v,, i, j>, <ASSign(”,, &,J,), i,j, 1) 4 
if Lb=itl then <¥5 id, 
else <Assign(y, ed, 4 j+1> where <v, i, j> is APPEND({K« Vis bs As Spy dy i) 


22 
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16. Stage 2: The Target Implementation 


The second stage of the synthesis procedure transforms the preliminary 
, implementation of the implemented type into a target implementation. For instance, in the 
_ example implementing Queue_Int in terms of Circ_List, the preliminary implementation 
derived in the last chapter (shown Fig.5 of chapter 2) is transformed into a target 
implementation such as the one shown in Fig. 0. 
There are two differences between a preliminary implementation and a target 
_ implementation. The first one is that in a preliminary implementation the only operations of 
the representation type allowed to appear are the generators of the type. The target 
implementation may also contain nongenerators of the type. The second difference is in the 
function definition methods used by the two forms of implementation. In a preliminary 
implementation a function is defined by means of a set of rewrite rules. For example the 


preliminary implementation of ENQUEUE (Fig. 5) is: 


ENQUEUK(Create, j) — Insert(Create, j) 
ENQUEUK(Insert(c, i), j) + Insert(ENQUEUE(C, j), i) 


In a target implementation a function is defined by means of a single expression. For 
example, ENQUEUE is defined as: ENQUEUE(d, k)::= Rotate(Insert(d, k)). The 
transformation performed takes into consideration both of these differences. 


It should be noted that a preliminary implementation is an executable 


Fig. 18. An Implementation 
NULLQ() :: = Create() 


ENQUEUE(c, j) :: = Rotate(Insert(c, j)) 
FRONT(c) :: = Value(c) 

DEQUEUE(c) :: = Remove(c) 
APPEND(c, d) :: = Join(d, c) 


SIZE(c) ::= if Empty(c) then O 
else SIZE(Remove(c)) + 1 


AQ 


_ y implementation. It can be executed by an interpreter that simplifies algebraic expressions 

. using the rewrite rules in the preliminary implementation and the specifications of the 

. implementing types. The interpreter must have a pattern matching capability to invoke the 

: appropriate rewrite rule while simplifying an expression. The program verification system 
AFFIRM [39], and the programming system PROLOG [??] provide such an interpreter. 
Given the specifications of all the implementing types, the interpreter can execute the 
preliminary implementation on any given input. For example, the value returned by the 
operation (of Queue_Int) Front on the queue constructed by Enqueue(Nullq, 1) is obtained 
by finding the normal form of FRONT(ENQUEUE(NULLQ( ), 1)) using the preliminary 

- implementation: ‘The normal form is 1. Depending on the range type of the operation, the 
normal form can, in general, be a generator constant of any of the implementing types. The 
normal form can then be evaluated assuming there exist implementations for the 
implementing types. | 

Our goal is to derive the target implementation in a form that can be compiled by a 
compiler for an applicative language. The motivation for this is primarily one of efficiency. 
There are two reasons why a target implementation is more efficient than a preliminary 
implementation. The first one arises because of the freedom to use nongenerators of the 
representation type in a target implementation. This enables one, in some instances, to 
eliminate recursion from the preliminary implementation of an operation, and to transform it 
into a target implementation which is merely a composition of the operations of the 
implementing types. The implementation of ENQUEUE shown above is an instance of such 
a situation. The use of the operation Rotate in the target implementation eliminates the 
recursion which was essential in the preliminary implementation. The second reason is that 
an implementation that can be compiled by means of a conventional compiler is in general 
more efficient than interpreting a set of rewrite rules, 

We develop two methods of deriving a target implementation from a preliminary 
implementation: The Recursion Preserving Method, and the Recursion Eliminating Method. 
Both the methods are based upon expansion using rewrite rules. The target implementations 
derived by the first method preserve any recursion that may exist in the corresponding 


preliminary implementations. The second method can eliminate recursion from a 
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: preliminary implementation of an operation if there exists a nonrecursive implementation for 
‘the operation. The second method is more general because it can also derive the 
implementations derived by the first: method. The advantage of the first method is that it is, 
. in general, faster than the second in situations where the two methods derive the same target 


implementation. 
6.1. The Recursion Preserving Method 


This method uses a special set of functions, called the inverting functions, on the 
implementing types for transforming a preliminary implementation into a target 
implementation. To understand what inverting functions are and how they are useful in 
deriving a target implementation, let us take a closer look at the difference in the function 
definition methods used by the’ two forms of implementation. The preliminary 


implementation for SIZE is 


SIZE(Create) — 0 
SIZE(Insert(c, i)) + SIZE(c) + 1, 


and a possible target implementation for it is 


SIZE(d) :: = if Empty(d) then 0 
else SIZE(Remove(d)) + 1. 


In the preliminary implementation, the argument to SIZE on the left hand side of a 
rule may be a generator expression. The argument indicates the structure of the expression 
that constructs the values for which the rewrite rule is applicable. This freedom serves two 
purposes in a preliminary implementation. Firstly, it is used for performing a case analysis 
based on the structure of the argument. Secondly, the explicit indication of the structure of 
the arguments on the left hand side makes the decomposition of the arguments trivial. For 
instance, in the second rewrite rule for SIZE the variable c used on the right hand side is 
actually a component of the argument to SIZE. We were able to access this component 
without actually having to generate code to decompose the argument. 

In a target implementation, the argument to SIZE on the left hand side of the 
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: definition is a variable. This means that the expression on the right hand side of the 
definition must have explicit pieces of “code" to perform the case analysis based on the 
_ Structure of the argument, and to decompose the argument. For instance, in the target 
_ implementation of SIZE given above, the subexpression Remove(d) extracts the component 
of the argument d that is denoted by the variable c in the preliminary implementation. The 
_ subexpression Empty(d) checks if d is a value constructed by Create; the if_then_else 
expression performs the desired case analysis. Let us call the subexpressions that perform 
: these functions mentioned above inverting expressions: . 

A preliminary implementation can be systematically transformed into a target 
implementation if the inverting expressions can be generated automatically. The inverting 
functions of the implementing types serve precisely this purpose. For instance, in the above 
example Remove and Empty are two of the inverting functions for Circ_List. The inverting 
expressions can be automatically derived in terms of the inverting functions. Thus, the 
transformation of a preliminary implementation into a target implementation according to 

_this method consists of two steps: First, determine the inverting expressions in terms of the 
inverting functions; second, derive implementations for the inverting functions in terms of 
the operations of the implementing types. The two subsections to follow describe the two 


steps. 
6.1.1 Inverting Functions and Inverting Expressions 


Inverting functions”? of a data type are a family of functions on the type that are 
inter-related in a special way. Inverting functions are defined with respect to a basis of the 
type. The relationship among the inverting functions of a family is such that the functions 
can be used to algorithmically invert the process of constructing a value from the generators 
of the type. In other words, it is possible to construct algorithmically the inverting 


23. Inverting functions are related to distinguished functions defined in [24]. A family of inverting 
functions for a data type can also serve as a family of distinguished functions. The reverse implication 
is not true in general. In [24] distinguished functions are used to formalize the expressive power of a 
data type. 
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» expressions as a composition of appropriate inverting functions. The inverting expressions 


' perform the following functions: 


(1) Given a variable v and a generator expression t, check if the value denoted by v can 
be constructed by a generator expression that has the form of t. Since an inverting 
expression that performs this function is normally a boolean expression, we call it a 


boolean inverting expression. 


(2) Assuming that a giyen variable v denotes a value that is constructed by an expression 
that has the form of a given generator expression t, determine the various 
components of t from v. We call an inverting expression that performs this function 
a component inverting expression since it extracts a component of a generator 


expression. 


For example, the operations Remove, Value, and ~(Empty) can serve as a family of 
inverting functions for Circ_List. This is because the inverting expressions for any generator 
expression of Circ_List can be automatically constructed from these operations. For instance, 

suppose v is a variable of type Circ_List, and t = Insert(Insert(c, i), j) is the generator 


expression under consideration. The following are some of the inverting expressions for t: 


Q) Not(Empty(Removet))) is a boolean inverting expression for t. It checks if v 


denotes a value constructed by a generator expression that has the form of t. 


(2) Some of the component inverting expressions of t are Value(v) which extracts j, 
Remove(Remove(y)) which extracts c, and Value(Remove(v)) which extracts 1. 


Let us now formalize the properties that characterize a family of inverting functions 
for an arbitrary data type. We express the properties in the form of rewrite rules. The 
| properties are such that they do not necessarily characterize a unique set of functions. This is 
done deliberately to offer flexibility in choosing an implementation for the inverting 
functions. Inverting functions are always defined with respect to a basis for the data type. 
Let the basis for the data type be & = {o,| {0}. Inverting functions can be classified into 

~ two categories: the component inverting functions and the boolean inverting functions. 
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(1) There is a set of n component inverting functions (d,, vway ») associated with every 

generator o, in the basis whose arity ism. They are characterized by the following 
property: ; 
o(d(o{v,,-.-; vs +2050 (of¥,,---,¥ ))) 4 OV, +005 v) . 
A generator whose arity is zero does not have any associated component inverting 
functions. The component inverting functions associated with o, factor a value 
constructed by o,. They return the arguments used by a, in constructing the value. 
At the outset it may appear more natural to characterize the component inverting 
functions as follows: d(o(v,, eee Vv») + Ve The problem with such a 
characterization is that it may result in ill-defined component inverting functions in 
situations where the generators can be used in more than one way to construct the 
same value. For instance, consider the basis 8 = {0, 1, +} for Natural_Numbers. 
If d, associated with + is defined as d,(x-+y)—+x, then we have a situation where 
d(0+ 1) = 0 and da +0) = 1. This will conflict with the rest of the specification 
of type Natural_Numbers which should allow us to prove that (0+1) = (1+0). 


(2) There is a boolean inverting function associated with every generator in the basis. 
The boolean inverting function, p,, associated with a generator o, returns True on 
values that can be constructed by a generator expression that has the form 
ofv,, say y,). So, p, is characterized by p{v) — o(d,(v),...,d.(¥)) =v, where = 
is the equality operation on the type. Thus, the recursion preserving method in 
general applies only when each of the implementing types has the equal operation 
defined on it. A simpler characterization, which applies only when the basis is such 
that every value of the type can be constructed uniquely using the generators is as 


follows: 
p(o(¥,, Sins v)) — True. 
pfo(v,, eaters v)) -+ False (i ¥ j) 


The basis for Cire_List is & = {Create, Insert}. It has two component inverting 
functions (d, and d,) both of which are associated with Insert, and characterized by 
; Insert(d,(insert(y, i)), d(Insert(v, i))) — Insert(y, i). It has two boolean inverting functions, p, 
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and p,, one associated with Create and the other associated with: Insert. They are 
_ characterized as follows. (Note that the generators of Circ_List are such that every circular list 


can be constructed uniquely in terms of the generators.) 


p, (Create) — True 
p, (Insert(c, i)) + False 


p,(Insert(c, i)) + True 
p,(Create) — False 


Notice that p, and P,, in this case, are complement of each other. So, while deriving 
implementations for the inverting functions, we implement only p,; p, is obtained as a 
negation of P, 

It is not hard to see how a preliminary implementation can be transformed into a 
target implementation in terms of the inverting functions. Fig. 19 gives a general procedure 
that does it for an arbitrary preliminary implementation. In the following, we illustrate the 
procedure on the preliminary implementation of SIZE. The preliminary implementation 


SIZE consists of the following rewrite rules. 


SIZE(Creatc; — 0 
SIZE(Insert(c, i)) + SIZE(c) + 1 


Suppose the left hand side of the target implementation is SIZE(v). The expression on the 
right hand side is a nested if_then_else expression that performs a case analysis. There is a 
case corresponding to every rewrite rule in the preliminary implementation. In the present 


case the right hand side would have the following form: 

ifb, thene, 

else if b, then e, 
The expressions b, and e, are determined from the first rewrite rule using the inverting 
expressions associated with the generator expression that appears as the argument to SIZE on 


the left hand side of the rewrite rule. The expressions b, and e, are determined similarly from 


the second rewrite rule. We will Jescribe how b, and e, are determined since they are more 
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Fig. 19. The Procedure RPM | : 


Suppose the preliminary implementation of F consists of the following rules: 


Fig) > ¢, 
K(g,) >t, 


a 
Kg )—t, 
Then, the target implementation for F is 


F(v) ::= if b, thens, © 
else if b, then s, 
] 


e 
° 


else if db, then S. 


where 
(1) b, is the boolean inverting expression of g, which is obtained by the procedure BIE described 
below. 
(2) s, is the expression obtained by replacing every terminal in t, by the component inverting 
expression of g that extracts te terminal. This is obtained by the procedure CIE described 
below. 


For convenience, we assume that the generators have an arity that is at 
most one. 


CIE = proc (a: generator expression, u: Occurrence) 
returns (component inverting expression) 


Suppose a is o(a,) 


dis the d-function associated with o 


if u = A then return(A) 
else if u = 1.v then return(d ° CIE(a,, v) 


end CIE 


BIE = proc (a: generator expression) returns (boolean inverting expression) 


if a is a variable then return(A) 
else if a = o(a,) 
then return(p ° A © BIEI(q,, d)) 
where p is the boolean inverting function associated with o 


a 


d is the component inverting function associated with o 


BIF1 = proc (a: genarator expression, d: inverting function symbol) 


returns ( beolean inverting expression) 


if a is a variable then return{A) 
else if a = of(a,) 
then return (p ° d © BIF(a,)) 


where p is the boolean inverting function associated with o 


- 130- 


interesting than the determination of b, and e,. b, is the expression that determines if'v 
denotes a value constructed from an expression that has the form of Insert(c, i), so b, is p,(v). 
; e, is identical to SIZE(c) + 1 except for the following modification: The variables ¢ and i, 
which denote the components of the expression appearing as argument to SIZE on the left 
hand side of the rule, are replaced by the corresponding inverting expressions that extract 
. those components from v. That is, ¢ is replaced by d,(v) and i is replaced by d,(¥). So, e, is 
SIZE(d i) +1. b, and e, can be determined similarly. So the target implementation for 


SIZE in terms of the inverting functions is below: 


SIZE(y) ::= if p,(v) then 0 
else if p,(v) then SIZE(d,(v)) +1 


6.1.2 Implementations for the Inverting Functions 


Implementations for the inverting functions are derived using the recursion 
eliminating method described in the next section. Note that the properties characterizing the 
inverting functions are expressed by means of a set of rewrite rules. Implementations for the 
inverting functions are determined by searching for appropriate compositions of the 
operations of the implementing types that satisfy the rewrite rules characterizing the inverting 
functions. In the following we show the theorem generation sequences that derive 
implementations for each of the inverting functions used above. 

Derivation for d, and d, 


Relevant Rewrite Rules used for Expansion 


aveucee 


(1) Value(Create) + ERROR 
(2) Value(Insert(c, i)) — i 

(3) Remove(Create) + ERROR 
(4) Remove(Insert(c, i)) + c 


sencsses 


Form of the theorem to be generated: Insert(y, i) = Insert(f* (Insert(y, i), f* (Insert(v, i))) 
Normal form of Insert(v, i): Insert(v, i) 
Rules used for the normal form: None 
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‘Step (1) Invoke Synthesis Rule (1) on 0 
Insert(v, i) = Insert(y, i) 


Step (2) Expand Expression: ¥ 
Using Rule: (4) 


Insert(y, i) = Insert(Remove(Insert(y, i)), i) 


Step (3) Expand Expression: i 
Using Rule: (2) 


Insert(v, i) = Insert(Remove(Insert(y, i)), Value(Insert(y, i))) 


The above theorem determines the following solutions for ah and age Remove and Value. Therefore, 
we have the following implementations for d, and d,. 


d(v) = Remove(y) 
d.(v) :: = Value(v) 


Derivation for p, 


Relevant Rewrite Rules used for Expansion 


(8) Empty(Create) — true 
(9) Empty(Insert(c, i)) — false 


eoveceee 


Form of the theorem to be generated: .True = f*(Create)) 
Normal form of True: True 
Rules used for the normal form: None 


Step (1) Invoke Synthesis Rule (1) on True 
True = True 


Step 2) Expand Expression: True 
Using Rule: (8) 
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Nee re erry 


True = Empty(Create) 


' The last theorem determines the following solution for f*: Empty. Note that this function also satisfies 
_ the other rewrite rule characterizing p,, namely p, (Insert( c,i)) — False. Therefore, p, can be 
implemented as follows: 


p,(v) :: = Empty(v) 
6.2 The Recursion Eliminating Method 


Let us suppose we are deriving a target implementation for an implementing 
function F whose preliminary implementation consists of the set of rewrite rules given below. 


Fg) t, 


Fe.) +, 
We assume that F is a single variable function for convenience. The general description of 
the method given below can be extended easily to a multivariable function. In a target 
implementation, the function F is defined as F(v) ::= e, where v is a variable, and e is an 


expression containing ¥ and any of the following function symbols: 
(1) Operations of the implementing types 
(2) The implementing functions 
(3) The function if_then_else 


Let us denote e as f*(v), where f* is some composition of the function symbols listed 
above. The derivation of a target implementation consists of finding a suitable f*. The 
composition f* should be such that the function defined by F(v) :: = f*(v) has the same 
behavior as the one . defined by the: set of rewrite rules given above. 

To characterize the problem formally, we define the following concept. A 
- composition f* satisfies a rewrite rule of F if the equation obtained by substituting f* for F on 
both the sides of the rewrite rule is a theorem of the rewriting system consisting of the 


Pigae 


_ preliminary implementation and the specifications of the implementing types. For example, 
the composition Rotate(Insert(d, k)) satisfies the rule 
:» ENQUEUE(Insert(c, i), j) + Insert(ENQUEUE(, j), i) if the equation 
: Rotate(Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i) is a theorem. 

The composition f* to be derived should be such that f satisfies each of the rewrite 
rules in the preliminary implementation of F. That is, the following equations should be 
theorems. (The notation t,[F -- f*] denotes. the expression obtained by replacing F by f* in 
t,.) | 


- (2) =t[F - f] 


f(g.) S t [F - f] 

The purpose of the above formulation (of the condition that a solution for f* is 
supposed to satisfy) is to allow us to use a theorem generation strategy similar to the one used 
in deriving a preliminary implementation. We generate a theorem using one of the above 
equations as a template by treating f* as a place holder in the equation. Let us call this 
equation the template equation. A theorem that has the form of the template equation 
determines a candidate for f*. A single theorem may determine more than one candidate for 
_ f*, but only finitely many, because the expressions we are dealing with have finite size. The 
candidate(s) can be determined automatically by comparing the theorem with the template 
equation. The goal is to generate a theorem that not only has the form of the template 
equation but is also such that the candidate for {* satisfies the rest of the equations in the 
preliminary implementation of F. 

The generation of theorems is carried out in the same fashion as in deriving the 
preliminary implementation. We use the same set of synthesis rules developed earlier. The 
theorems that are of interest to us in the present situation involve only the operations of the 
implementing types and the implementing functions. Therefore, the rewriting system that is 
used for performing expansion (while generating the theorems) consists of the preliminary 
implementation and the specifications of the implementing types. In contrast, the rewriting 
system used in the derivation of the preliminary implementation consisted of the 
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. specifications of the implemented type and the association specification. Note that the 
preliminary implementation did not exist at that time. Checking if a candidate for f* satisfies 
, the rewrite rules essentially involves checking if an equation is a theorem. 

Let us illustrate the method on the derivation of the target implementation for 
ENQUEUE shown earlier. The preliminary implementation of ENQUEUE is repeated 


below for ease of reference. 


ENQUEUK(Create, j) — Insert(Create, j) 
ENQUEUFK(Insert(c, i), j) + Insert(ENQUEUE(, j), i) 


The f* to be derived should be such that the following equations are theorems. (Note that the 
equations are obtained by replacing ENQUEUE by f* in the rewrite rules, and then 


interchanging the two sides. The reason for interchanging the sides will be explained shortly.) 


qd) Insert(Create, j) = f*(Create, j) 
(2) Insert(f*(c, j), i) = f*(Insert(c, i), j) 


We use equation (1) as the template equation. The nature of our synthesis rules imposes ° 
certain restrictions on the equations that can be used as template. The synthesis rules are 
formulated to generate theorems with a known left hand side, but an unknown right hand 
side. So, the template equation should be such that the unknown entity f* appears only on 
the right hand side. In equation (2) both sides are unknown since f* occurs on both the sides. 
This was also the reason behind interchanging the two sides of the rewrite rules while 
obtaining the above equations. Note that there always exists at least one equation with a 
known right hand side. This corresponds to the rewrite rule in the preliminary 
implementation of F that represents the basis case. 

Shown below is a sequence of steps that generates a theorem that gives rise to a 


target implementation. 


Relevant Rewrite Rules used for Expansion 


(3) Rotate(Create) — Create 
(4) Rotate(Insert(Create, i)) ++ Insert(Create, i) 
(5) Rotate(Insert(Insert(c, i1), i2)) + Insert(Rotate{Insert(c, i2)), i1) 
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Form of the theorem to be generated: Insert(Create, j) = f*(Create, j) 
Normal form of Insert(Create, j): Insert(Create, j) 
Rules used for the normal form: None 


Step (1) Invoke Synthesis Rule (1) on Insert(Create, j) 
Insert(Create, j) = Insert(Create, j) 


Step (2) Expand Expression: Insert(Create, j) 
Using Rule: (4) 


Insert(Create, j) = Rotate(Insert(Create, j) 


The right hand side of the last theorem generated in the above series has the form of 
f*(Create, j), and hence can be used to generate a set of candidate compositions. A candidate 


composition is determined from three expressions: 
_-()): the left hand side of the faigek implementation, say F(v,,..-, ¥,) 
(2) the right hand side of the theorem generated, say a, and 
(3) the right hand side of the template equation, say f*(g,, ..., g,). 


It is obtained by replacing zero or more occurrences of g, for every 1 <i<n, in a by a 
variable Ys l<j<n. The replacement of g, by yj is made so that type consistency is 
preserved. 

For the current example, the left hand side of the target implementation is 
ENQUEUEQ, k) ::= ?; the right hand side of the theorem generated is Rotate(Insert(Create, 
j); the right hand side of the template equation is *(Create, j). So, there are two candidates 
for f*(d, k): (1) Rotate(Insert(d, k)) and (2) Rotate(Insert(Create, k). 

The second candidate does not satisfy equation(2). The equation obtained by 
replacing f* in the equation by the candidate is 
Insert(Rotate(Insert(Create, j)), i) = Rotate(Insert(Create, j)). This is not a theorem of 
Cire_List because (for every i and j) both the sides of the equation remain simplified, but will 
not be identical. (This can be checked by Is-an-inductive-theorem-of.) 
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Let us consider the first candidate. The equation obtained by substituting it for f* in 
- equation (2) is Rotate(Insert(Insert(c, i), j)) = Insert(Rotate(Insert(c, j)), i), and this is a 
. theorem of Circ_List. (The left hand side of the equation reduces to the right hand side by 
the rewrite rule (5).) Hence Rotate(Insert(d, k)) satisfies equation (2). The second candidate 


does not satisfy equation (2). Hence the target implementation is: 


ENQUEUEG, k) ::= Rotate(Insert(d, k)) 
6.3 An Illustration of a Complete Synthesis 


In the following, we illustrate the complete synthesis, i.e., an illustration of both the 
stages, of two examples. The first one derives a target implementation for the operation 
Append of Queue_Int using the association specification that specifies the Circ_List 
representation, The second example derives a target implementation for the Front using the 
association specification that specifies the <Array_Int X Integer X Integer> representation 
(see chapter 5). 


THustration 1 
Stage 1: 


Partial Preliminary Implementation of Append at Hand 
APPEND(c, Create) — ?rhs, 
APPEND(c, Insert(d,i)) —+ rhs, 


Relevant Rewrite Rules of the Perturbed World 


(10) Append(q, Nullq) — q 

(14) 36(Create) —+ Nullq 

(20) JE(ENQUEUE(S, i)) + Enqueue(46(c), 36(i))})_ 
(22) 36(APPENIXc, d)) — Append(36(c), 36(d)) 


Derivation of the first rewrite rule 


Form of the theorem to be generated: 36(APPEND(c, Create)) = J6(?rhs,) 
Normal form of 36(APPENDM\c, Create)): 36(c) 
" Rules used for the normal form: (22), (14), (10) 


- Step (1) Invoke Synthesis Rule(1) on 36(APPEND(c, Create)) 
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36(APPEND(c, Create)) = 36(c) 


The above theorem is such that APPEND(c, Create) > c. Therefore the desired rewrite rule is: 
APPEND(c, Create) — c 


Derivation of the second rewrite rule 


Form of the theorem to be generated: JG(APPEND(c, Insert(Create, ))) = J6(?rhs,) 
Normal form of 36(APPEND(c, Insert(Create, 1))): Enqueue(J6(c), 36()) 
Rules used for the normal form: 


Step (1) Invoke Synthesis Rule (1) on 36(APPEND(c, Insert(Create, 1))) 
36(APPEND\(c, Insert(Create, 1))) = Enqueue(36(c), 36()) 


Step (2) Expand Expression: Enqueue(36(0), 36()) 
Using Rule: (10) 


36(APPEND(c, iisert(Create, ))= Appena(Caquenct30(c 36())), Nullg) 


Step (3) Expand Expression: Nullq 
- Using Rule: (14) 


CRO New ater Senn eee nee Re ReRenE Ease meres seene es cEEsenersessoraneense 


36(APPEND(c, Insert(Create, i))) = Append(Enqueue(36(c), 36()), 36(Create)) 


‘Step (4) Expand Expression: Enqueue(36(c), 36() 


36(APPENID\(c, Insert(Create, ))) = Append(}4(ENQUEUE(c, 3), 36(Create)) 


Step (5) Expand Expression: Append(JG(ENQUEUE(C, ’)), 36(Create)) 
Using Rule: (22) 


Swe sewEe ee Se ern e Be esse Rene Rs Neen De TeERT Ee SES eRsesetensesensecesseeen 


36(APPEND(c, Insert(Create, ))) = (APPENIKENQUEUE(c, i), Create)) 
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r Step (6) Generalize the theorem in step (5) by replacing the constant 
Create by the variable d to obtain the following equation: 
36(APPEND(c, Insert(d,i ))) = JG(APPEND(ENQUEUE(., 3, @) 


Apply Is-an-inductive theorem-of on the above equation. 
This yields True confirming that the equation is a theorem. 


_ Hence the desired rule (obtained by dropping 36 on both sides) is: 
APPEND(c, Insert(d,i)) - APPEND(ENQUEUE(,, i), d) 


Stage 2: 

Preliminary Implementation at Hand 

APPENI\(c, Create) - ¢ 

APPENI\(c, Insert(d,i )) - APPEND(ENQUEUE(c, 0), ad 
Desired Form of Target Implementation 


APPEND\,, ¥,) 3: = ”? 


Relevant Rules of Circ_list 


(10) Join(c, Create) + c 
(11) Join(c, Insert(d, i)) — Insert(Join(c, d), i) 


Template Equation Chosen: c= APPEND(c, Create) 
Form of the theorem to be generated: c = f*(c, Create) 
Normal form ofc: c 

Rules used for the normal form: None 


Step (1) Invoke Synthesis Rule (1) on c 
c=c 


Step (2) Expand Expression: c 
Using Rule: (10) 


mee ROR S Ons PEERS STE eE NS SESE Rees enenEEemasTeSEEesSesesEeDeEPEeEE EST 


c = Join(c, Create) 


Step (3) Find a suitable candidate composition. 


The right hand side of the above theorcin has the form of F*(c, Create). So, find a suitable candidate 
composition. ‘Phere are two possibilities: (1) Joialy,, v,). and (2) Join(y,, VE. The second candidate 
satisfies the second rule of the preliminary implementation, but the first docs not. So, a possible target 
implementation 1s: 

APPENDG,, Vos Join(y,, v) 
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Illustration 2 
Stage 1: 

Partial Preliminary Implementation of Append 
FRONT(<», i, 2) — ?rhs, 


FRONT (<Assign(y, ¢, ), i, i+ 1) — ?rhs, 
FRONT(<Assign(Assign(\,¢,, /), &) /+1), 4 j+2>) — ?rhs, 


Relevant Rewrite Rules of the Perturbed World 


(1) 36(<vy, i, i>) 4 Nullg 
(2) 3G(<Assign(y, e, j), i, j+1>) — if_then_else(i = j+1, Nullg, Enqueue(36(<y, i, j>), 36(e))) 


(3) 36(FRONT(x)) — Front(36(x)) 
(4) 3G(ERROR) — Error 
(5) J6(if_then_else(b, v v,)) — if_then_else(b, J6(¥,), JE(v,)) 


Y 


Derivation of the first rewrite rule 


Form of the theorem to be generated: JG(FRONT(<), i, D)) = J6(?rhs,) 
36(FRONT(<)y, i, D))+: Error 
Rules used for simplification: 


Step (1) Invoke Synthesis Rule (1) on J6(FRONT(<»y, i, D)) 
JG(FRONT(<y, § >)) = Error 


Step (2) Expand Expression: Error 
Using Rule: (4) 


6(FRONT(<,, i, )) = 36(ERROR) 


FRONT(<), 4 >) —+ ERROR © 


Derivation of the second rewrite rule 


Form of the theorem to be generated: J6(FRONT(<Assign(y, e, ), i, i+1>)) = J6(?rhs,) 
J6(FRONT(<Assign(y, e,),ii+bD))b: Be 
Rules used for simplification: 
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- Step (1) Invoke Synthesis Rule (1) on (FRONT (<Assign(y, e, ), 4 i+ 1>)) 
(FRONT (Assign), e, ), i, i+ 1>)) = J(e) 


FRONT(<Assign(y, e, 2), i, i+1>) + € 


Derivation of the third rewrite rule 


Form of the theorem to be generated: J6(FRONT(<Assign(Assign(y,¢,, j), ¢, J+ 1), § j+2>)) = J6(?rhs,) 
J6(FRONT(<Assign(Assign(y,,, J), €j+1), 5 j+2>))4: 
if_then_else(i = j+2, Error, if_then_else(i = j+1, 3(e), 
Front(Enqueue(36(< v, 4 >), e))) 
Rules used for simplification: 


Step (1) Invoke Synthesis Rule (1) 
J(FRONT(<Assign(Assign(y,e,, ), ¢,, +1), 4 i+ 2>)= 
if_then_else(i = j+2, Error, if_then_else(i = j+1, 36(e,), 
Front(Enqueue(J6(<v, i, >), e)))) 


Step (2) Expand Expression: Front(Enqueuc(J6(<,y, i, j>), €,)) 
Using Rule: (2), Protocol 3 
TW Update: 
i= j+2— False 
i= j+1— False 
J6(FRONT(<Assign(Assign(y,e,, ), €,,j+1), 4 j+2>)) = 
if_then_else(i = j+2, Error, if_then_else(i = j+ 1, 36(e,), 
Front(J6(<Assign(y, ¢,, J), 4 j+1>)))) 


"Step (3) Expand Expression: 36(<Assign(», ¢,,.), é j+1>) 
Using Rule: (3) 
I6(FRONT(<Assign(Assign(e,,), ¢y i+ 1), 6 j+2>)) = 
if_then_else(i = j+2, Error, if_then_else(i = j+1, J6(e,), 
IG(FRONT(<Assign(», ¢,,), 4 j+1>)) 
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Step (4) Expand Expression: Error 
Using Rule: (4) 
36(FRONT(<Assign(Assign(y,e,, /), @» J+1), § j-+2>)) = 
if_then_else(i = j+ 2, J6(ERROR), if_then_else(i = j+1, 16(e,), 
J6(FRONT(<Assign(y, ¢,, J), 6 + 1>)))) 


Step (5) Expand Expression: if_then_else(i = j+ 2, J6(ERROR), if_then_else(i = j+1, 36(e,), 
J6(FRONT(<Assign(y, e,, J), 4 + 1>)))) 
Using Rule: (5) 
J6(FRONT(<Assign(Assign(,¢,, /), ¢,,j+ 1), 4 j+2>)) = 
J6(if_then_else(i = j+2, ERROR, if_then_else(i = j+1, e, 
FRONT(<Assign( yy, es D, bj+)))) 


FRONT (<Assign(Assign(v,e CDs Oy J+ 1), 5.j+2>)—- 
if_then_else(i = j+2, ERROR, if_then_else({i = j+1, 2, 
FRONT(<Assign(,y, ¢,, /), 4 J+ 1>))) 


Stage 2: 
Preliminary Implementation at Hand 


FRONT(X), 4 Dd) — ERROR 
FRONT (<Assign(yy, ¢, i), 4 i+ 1>) + 
FRONT (<Assign(Assign(y,¢,, Ds ey J+ 1), 5 j+2>) — if i = j+2 then ERROR 
else if i = j+1 then e, 
else FRONT(Assign( ¥, Es e.D, ijt 1>) 


Let FRONT (<arr, pnti, pnt2>) be the left hand side of the target implementation. We use a slightly 
different method than the one normally used for deriving the target implementation for Front. We use 
combination of the recursion preserving method and the recursion eliminating method. First, a 
composition that satisfies the first rewrite rule is determined separately; it is easy to see that this can be 
ERROR. Then, a composition that satisfies the second and the third rewrite rules is determined. The 
two compositions are then combined with the help of a boolean inverting expression to arrive at the 
target implementation. Note that the boolean inverting expression that characterizes the argument 
structure corresponding to the first rewrite rule is pnt! = pnt2. Therefore, tle desired form of the 
target implementation is as below. The expression that takes the place of the else clause is to be 
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determined so that the second and the third rewrite rules are satisfied. 


Desired Form of the Target Implementation 


FRONT(Karr, pntl, pnt2>) ::= if pnt! = pnt2 then ERROR 
else ?? 


Relevant Rewrite Rules of Array_Int and Array_Int X Integer XInteger 


The first two rules specify the Read operation of Array_Int that reads an element of an array. The third 
rewrite rule specifies the operation of a triple that selects the first component. 


(1) Read(Wullarray, i) — ERROR 
(2) Read(Assign(v, e, j), i) — ifi = jthene 
else Read(y, i) 


(3) First(<v, k, D) > v 


Template equation chosen: e = FRONT(<Assign(), e, 9, i i+ 1) 
Form of the theorem to be generated: e = f*(<Assign(y, ¢, ), i, i+1>) 
Normal form of e: e 

Rules used for simplification: None 


Step (1) Invoke Synthesis Rule (1) on e 


e=e 


Step (2) Expand Expression: e 
Using Rule: (2), Protocol 2 


Se ease eS Beene see BBs AS EEs SHOT E SEE NEES EERE EASE HESS EAeNHereREERNERe 


e = Read(Assign(y, e, i), i) 


Step (3) Expand Expression: Assign(y, e, i) 
Using Rule: (3) 


Se eRe ee ROR OES ARR ERAT RES See RE SeE EER Rees eee armen Re EE eeeeS 


e = Read(First(<Assign(y, e, i), k, D), i) 


Step (4) Replace variables in the theorem by appropriate terminals: 
Yroviteikreilrey i+] 
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e = Read(First(<Assign(y, e, i), £i+1>),) 


The right hand side of the last theorem generated has the form of f*(<Assign(y, e, i), i, i+1>). It 


; determines the candidate composition Read(First(<arr, pnt1, pnt2>), pntl), which’ can be simplified to 
- Read(arr, pnt!). This composition is such that when it is takes the place of 7? in the partial target 


implementation shown above, the whole expression satisfies the third rewrite rule in the preliminary 


. implementation. Hence, the a possible target implementation for FRONT is: 


FRONT (arr, pnt1, pnt2>) ::= if ppatl = pnt2 then ERROR 
else Read(arr, pnt1) 
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Ae Conclusions and Future Research 


Algebraic specifications for data types have been extensively used to prove 
properties of data types and to establish the correctness of implementations of data types. In 
_ this thesis we have investigated the task of automatically synthesizing implementations for 
abstract data types starting from their algebraic specifications. In this chapter we summarize 
the major contributions of the thesis, describe the important conclusions the research has lead 
us to, and provide directions for further research. 

One of the main decisions that we were confronted with at the start of the research 
was choosing and characterizing the inputs to the synthesis procedure. It is not surprising to 
expect as inputs the specification of the implemented type, and the specifications of all the 
implementing types. The novelty of our method lies in the use of two other inputs: the 
homomorphism information and the termination ordering. The advantages of having them 
as inputs became more evident as the research progressed. 

The homomorphism information makes the problem more tractable by restricting 
the space to be searched in finding an implementation because it imposes additional 
constraints on the synthesis equations (see chapter 4). It is informative in this respect to 
compare our method with. that of Okrent’s [40]. The method developed in [40] can also be 
reformulated as a theorem generation activity within our framework. His method, however, 
is less general and less efficient than ours because he does not use the homomorphism 
information. In order to compensate for the lack of this information he is forced to severely 
restrict the form of the specifications. 

The termination ordering. is not essential but is useful for automating the synthesis 
procedure. The basic method of manipulation used by the synthesis procedure is expansion 
(see section 4.4.1 and 4.5). Expansion, unlike reduction, is not uniformly terminating -- even 
when the specifications are convergent (see section 3.3). This makes the synthesis procedure 
potentially nonterminating. The termination ordering circumvents this problem. It also 
ensures the termination of the implementation derived. The synthesis method used by 
Darlington [7] does not explicitly indicate the use of any termination ordering. This is one of 


the reasons that the issue of termination (that of the synthesis procedure, or that of 
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3 implementation derived) is not addressed in [7]. : 
An important contribution of the thesis is the development of a formal basis for the 
~ method used by the synthesis procedure. The development is influenced significantly by the 
techniques used for verifying the correctness of implementations of algebraically specified 
. data types. The synthesis method has two distinguishing features. The first is that it is based 
on the general principle of reversing the techniques of program verification. The second is 
the decomposition of the procedure into two stages. 

The reverse program verification principle lead us to view the synthesis problem 
(see chapter 4) as one of generating a set of theorems that satisfy the synthesis conditions. 
The synthesis conditions characterize the situations in which a set of theorems of the input 
specifications is guaranteed to yield a correct implementation. The synthesis rules provide a 
means of generating theorems from a specification. This approach to synthesis has two 
advantages. Firstly, it makes the formal justification of the correctness of the synthesis 
method simple because the synthesis conditions are based on a criterion of correctness for 
data types. Secondly, it allows us to build on the research in the area of program verification - 
past as well as future. This naturally suggests an area in which to pursue future research. It 
concerns extending the theory in which the synthesis procedure operates. Currently it 
operates in the part of inductive theory of the specification that is decided by the Musser/ KB 
method (see chapter 4) of proving equational and inductive properties of rewriting systems. 
This extension would involve developing new synthesis rules, and new ways of using the 
synthesis rules for generating theorems. One might, for example, look into ways of 
assimilating the proof techniques used by various verifiers [5, 27] into our framework. 

Another advantage of decomposing the procedure into two stages is that it makes 
the procedure more modular. It isolates the part that is dependent on the target language. So 
modifications to the target language can be made without drastically affecting the synthesis 
procedure. A possible extension to the thesis that could be considered is to incorporate more 
equivalence preserving transformations into the second stage. The transformations can be 

either of an efficiency improving nature, or language developing nature such as applicative to 

| imperative transformations. 


In addition to characterizing the inputs, an important contribution of the thesis is 
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t the characterization of the generality of the synthesis method. The thesis formally 
. characterizes (see chapter 2 and section 3.3) the restrictions on the inputs, and the conditions 
: under which it succeeds in finding an implementation. This was possible primarily as a result 
- of the development of the formal basis for the synthesis method. 

Finally, but most importantly, let us address the question that any work on program 

synthesis has to confront: How far does the work go towards making the programmers task 
- superfluous? The practical utility of a work in program synthesis can be determined by 
_ evaluating the following aspects of the synthesis procedure: Efficiency of the synthesis 

method, efficiency of the implementations derived, and the ease of writing specifications. 

The main source of inefficiency in the synthesis procedure stems from the 
non-uniquely terminating nature of expansion. This forces (as shown in section 4.5) the 
procedure to keep track of all possible expansion paths. The implementation of the 
procedure given in section 4.5 uses only the most obvious ways in which unproductive paths 
can be pruned. There are several avenues for further research in this area. One can 
investigate the use of various heuristic approaches for cutting down unproductive paths. 
Another possibility is to make better use of the invariant information available in the 
association specification. The procedure (see chapter 5) currently uses it just as one of the 
conditions to terminate the theorem generation activity. A better utilization of it would be to 
guide the theorem generation activity. For instance, it would be more useful if it were 
possible to deduce from the invariant specification certain structural properties of expressions 

_ that prevented them from satisfying the invariant. This could then be used to discontinue 
unproductive expansion paths during theorem generation. It is hard to extract this kind of 
information from an algebraic specification of §. It would be interesting to consider other 
means of specifying 3 which can help this cause. 

The synthesis procedure currently does not take into consideration the efficiency of 
its output in synthesizing an implementation. It derives the implementations that it is capable 
of deriving in increasing order of complexity (in terms of the number of reduction steps 
needed) of the proof of the implementations. The are two reasons for this. Firstly, we know 
of no good ways of specifying performance constraints for operations of data types within an 
algebraic framework. Secondly, it was beyond the scope of the current work to incorporate 
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: automate performance analysis of the implementations. There is some recent work being 
done in this area in [50] that is compatible with algebraic theory of data types. It would be 
interesting to investigate the interaction between our work and that of {50}. 

| The main reason for choosing an equational language to sire the inputs was 
because of the benefits it offers from a proof theoretical point of view. Equational 
specifications have generally been found hard to write. This is one of the factors that reduces 
the practical value of the procedure, It would be useful to extend the synthesis procedure to 
accept specifications in a language that is easier to write. 

We believe that the goal of the research in program synthesis (and program 
verification) should not and cannot be to relieve the programmer completely of the burden of 
programming. Rather, it should be to help us gain a better insight into the science of 
programming. The insight gained can be utilized in several ways that are practically relevant, 
such as in the design of new programming languages, and in the development of program 
maintaining and program development [19, 49, 2, 3] systems. We believe that our work can 


be particularly useful in the latter area. 
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Appendix I - Equations as Rewrite Rules 


‘Automatic verification of data types that are specified equationally is often based on treating the 
equations in the specifications as rules for rewriting expressions that have certain patterns, The 
automation of our synthesis method also relies on such a treatment of the specifications. This appendix 
describes the basic concepts about rewrite rules, and some useful properties of sets of rewrite rules. 


We assume a denumerable set (¥) of elements called variables, and a finite set = of function symbols. _ 
We define expressions and constants over 2 as follows. (The formal definition is similar to the 
informal one given back in sec.3.3.1.) 


Expressions 


An expression is either (1) a variable, or (2) a function symbol f followed by a sequence of n > 0 
expressions Cee se. f is called the (main) function of this expression, and Cy - 025 @, are called the 
arguments... Such an expression is written fe, -+-,€,). An expression with no arguments is written 
as f(). We denote the set of expressions defined over 2 as E(Z). 


We assume it is possible to test variables and function symbols for equality. Two expressions a and B 
are regarded as identically equal (written a = £) if and only if they are both the same variable or they 
have the same main function symbol and the same number of identically equal arguments, in the same 
order. 


The variable set of an expression a is {a} if a is a variable, otherwise is the union of the variable sets 
of the arguments of a. 


The subexpressions of an expression are (1) the entire expression, and (2) the subexpressions of the 
arguments (if any) of the expression. Expressions which are variables have no expressions other than 
themselves, 


Constants 


‘A constant is an expression that does not contain any variables. We denote the set of constants over Z 
as T(X). The subconstants of a constant are (1) the entire constant, and (2) the subconstants of the 
arguments (if any) of the constant. 


Occurrences 


An expression can be represented naturally as a tree structure: The main function symbol of the 
expression is the root of the tree; the arguments of the expression are the branches of the tree. This 
analogy can be used to devise a notation to identify unambiguously the subexpressions of an 
expression. , 


An occurrence in an expression is a sequence (possibly empty) of positive integers that denotes the 
path inside the tree corresponding to the expression that runs from the root of the tree to the root of 
the tree corresponding to one of the subexpressions. We denote the set of all occurrences in an 
expression € by O(e). We use the following notation for denoting an occurrence: A is the empty 
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© occurrence, and if u is an occurrence and i is an integer, then i.u is the occurrence that has i at its head 
and 1 as its tail. 


_ The subexpression of an expression e at the occurrence u, denoted by e/u, is defined as follows: 
Ifu = A, thene/A =e 
Ifu = iw(1 <i<n), ande = fle, ++, @,), then e/u = €./W 
For example, suppose e = Enqueue(Dequeue(Nullq( )), i). Then e/1 = Dequeue(Nullq 0), 
e/2 = i, e/1.1 = Nullq(). 


Suppose uw is an occurrence of e. Then, we use the notation e[u+—e'] to denote the expression 
obtained by replacing in e the subexpression e/u by e'. For instance, suppose e is the same expression 
as in the example given above, and e" = Nullq(), then e[1 ~ e"] is Enqueue(Nullq( ), i). 


Substitutions 


Let o be a mapping from variables to expressions, such that o(v) = v for all but a finite number of 
variables y. Extend the domain of o to the set of all expressions by defining o(f(e,, see's e.)) to be 
f(a(e,), -++, 0(€,)). Such a mapping o is called a substitution (of expressions for variables). The 
notations = Iv, HH Ogee VE ei will be used to denote the substitution o such that o(V,) = €,, 
forl < i < n,and o(¥) = Vv. 


We say that an expression B has the form of an expression a if there exists a substitution o such that 
o(a) = 8. For example, Append(Nullg(), Enqueue(q,i)) has the form of 
Append(q1, Enqueue(q2, i2)) by the substitution 0 = [ql + NullqQ, q2 — q, i2 + i]. Notice that 
has the form of is not a symmetric relation. 


Rewrite Rules 


A rewrite rule is an ordered pair of expressions (L, R), such that the variable set of R is contained in 
the variable set of L. Usually (L, R) will be written L— R. A finite set of rewrite rules over a set of 
function symbols & is called a rewriting system over &. Let R be such a rewriting system. 


An expression a is reducible with respect to R if there is a rule L + R in R, and an occurrence u of a 
such that a/u has the form of L. Let o be a substitution such that o(L) = a/u, and 
B = alu o(R)]. Then we say that a directly reduces to B (using R), and write it as a —+ B (using 
R).. Where the particular R in use is clear from the context, this will be written simply asa — 8. Ifa 
is not reducible with respect to R, then we say a is irreducible with respect to R. 


Let —+* be the smallest relation on pairs of expressions which is the reflexive, transitive closure of —. 
Thus, a -+>* 8 if and only if there exist expressions aya,,.-., a,, where n > 0, such that a = ag, 


a,—a,,,fori=0,...,n-landa, = B. We read a —+* B as a reduces to B. 

Suppose a —+* B, and £ is irreducible. Then we say that a simplifies to B; B is called a normal form 
of a. We denote the normal form of e as e+. A rewriting system R has the unique termination 
property (UTP) if the simplifies relation defined by R is a function; that is, every expression has at 
most one normal form in R. 


A rewriting system R_ has the finite termination property (FTP) if there is no infinite sequence 


al S0' 


Qa, >». using R. 


A rewriting system R is convergent if it has FTP as well as UTP. In such a case, every expression in 
the system has exactly one normal form. 
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Appendix II - Checking Finite Termination 


A general technique for proving termination of a rewriting system R with an alphabet = is to 
demonstrate that it is possible to define a well-founded partial ordering >, on T(Z) so that t, + t, 


implies t, Ppt. A partial ordering is well-founded if there are no infinite descending sequences such 
as t, ~R t, ~R for any constants. Hence, there cannot be any infinite sequence of rewrites using R 


also. The following theorem [Manna&Ness} provides a useful guideline to define a suitable partial 
ordering to prove FTP. 


Theorem 3 A rewriting system R with an alphabet = satisfies FTP if there exists a well-founded partial 
ordering ;--, on T(%) with the properties given below. We call-a well-founded partial ordering that 


satisfies the following properties a termination ordering for the system R since the ordering can be 
used to show the termination of R. 


(1) Reduction: For every rule L-—R in R, and for every substitution o of variables to 
constants, o(L) >, o(R). 


(2) Substitution: t >, t' implies f(..t...) +p f(..t"...) for any constants ¢, t", f(t), f(t"... 
in T(2). 


The reduction condition asserts that applying any rule reduces the subterm to which the rule is applied 
in the well-founded ordering. The substitution condition guarantees that by reducing subterms the 

: 5 Vole , 
top-level constant is also reduced. Hence it follows thatt—+ t impliest > t . 


Fig. 20 gives a definition of a class of orderings called the /exicographic recursive path orderings (>). 
> is parameterized with respect to an ordering (>) on the alphabet of a rewriting system. In addition 
to the substitution property mentioned in the above theorem, > also contains the subterm relation: t, 


is a subterm of t, implies that t, > t,. Such an ordering is usually referred to as a simplification 


Fig. 20. The Lexicographic Recursive Path Ordering 


Let > be an ordering on an alphabet 2. Then > on 
T(2) is defined as follows: 


S > t iff one of the following conditions is true 
Q)f>gAs>t,l<gign 

(2)f = BA<S,,.--,S8, > >>), St... f > AS>t,1<iga 
3) Gs) [s;, = t Vs >t] 


>>o, is a right to left lexicographic ordering based on >-. It is defined as follows. 


€Spy 000 yS.> > ype Sys ee oy > iff 
(Ql<i<nbs,>tAWi<i<ns, = ty 
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: ordering [Dershowitz]. (A proof that‘ is a simplification ordering can be found in [Kamin].) 
Dershowitz in [Dershowitz] has shown the following theorem: 


_ Theorem 4 A lexicographic recursive path ordering (>) is well-founded if and aly if the underlying 
: alphabet ordering (>) is well-founded. 


One can, in general, use any suitable well-founded alphabet ordering in conjunction with a 
lexicographic recursive path ordering to use it as a termination ordering for a rewriting system. 

” Figures 21, and 22 give two alphabet orderings: The first can be used for an arbitrary data type 
specification, and the second for an arbitrary homomorphism specification. We refer to these 
orderings as the standard alphabet ordcrings for a data type specification, or a homomorphism 
specification, respectively. The orderings are based on a gencral method of structuring of the alphabets 
of a data type specification and a homomorphism specification. Assuming that there is no circularity in 
the defining_types relation on data types, it can be easily shown that the standard alphabet orderings 
are well-founded orderings. 


A lexicographic recursive path ordering based on an ‘alphabet ordering of Fig. 21 can serve as a 
termination ordering for the rewriting systems corresponding to Queue_Int and Circ_List. We leave 
it to the reader to convince for himself that > satisfies the reduction property in each of the two cases; 


Fig. 21. The Standard Alphabet Ordering for a Data Type Rewriting System 


Notations 


S is the rewriting system corresponding to TOI 
> is the alphabet of S 

Q is the operation set of TOI 

Q, is the set of generators of S 


Qi is the set of nongenerators of S 
_ 2 peor is the union of the alphabets of the rewriting systems of the defining types 
(We assume that the alphabets are mutually exclusive.) 


> is a partial ordering on the symbols in 


Definitions 
a= 2, U QB U Det 
Q = 2, U Ons 


> is defined as follows. It is assumed that a similarly defined ordering exists for each of the alphabets. 
in Zp > is assumed to contain each of these orderings. 


f> g iff one of the following conditions is true 
(Lg EQ, A arity ofg = 0, anole 
()fE2,, A gE a, 

GfEQAGEX,, 


- 159 - 


: Fig. 22. The Standard Alphabet Ordering for Homomorphism Specification 


Notations 


2y is the alphabet of the homomorphism specification 
a is the standard alphabet ordering on 2H 


Definition 


f a g if and only if one of the following conditions holds: 


(1) f is the symbol! 36, and g is any other function symbol in Ly 
(2) f is an auxiliary function symbol, and g is a generator function symbol 


one needs to use the fact that > contains the subterm relation in doing so. The ordering cannot, 
however, be used for Array_Int specification. The ordering can be used for a subset of the 
specification that is used in examples to illustrate the synthesis procedure. A lexicographic recursive 
path ordering based on the standard alphabet ordering of Fig. 22 can be used all of the sample 
homomorphism specifications given in the last chapter. 


Lexicographic recursive path orderings are useful in defining termination ordering for a rewriting 
system that is built from two or more rewriting systems that have recursive path orderings already 
defined on them. Suppose >, and >-, are two recursive path orderings defined with respect to the 


well-founded alphabet orderings >, and >,, respectively. Suppose R,, and R, are two systems for 
which >-, and >-, can serve as termination orderings. Then the recursive path ordering that is based 
on >, U>, can be used as a termination ordering for the system R, UR, provided >, U >, is 


well-founded. The standard alphabet ordering is such that the union of any two of them (defined on 
mutually exclusive alphabets) preserves the well-foundedness property. Hence it is useful in the 
context of combining systems of rewriting systems. 
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Appendix III - Proofs of Theorems | 


Theorem 6 
Let S be a system that satisfies the aeeeiie of definition. Let e, = e, be an equation so that e, ande, 


have at least one nongenerator function symbol in them. Then, e, = e, is a theorem of S if 
SU fe, — e,} also satisfies the principle of definition. 


Proof The proof is by contradiction. Let us assume that S U {e, > e,} satisfies the principle of 
definition, but e, = e, is not a theorem of S. 

If e, = e, is not a theorem of S, then there exists a substitution o that maps variables to 
generator constants so that o(e,) and o(e,) have distinct normal forms in S. Since S satisfies the 
principle of definition, o(e,) and ofe,) have unique normal forms that are generator constants; let the 
normal forms be t, and t,, respectively (t, # t,). Note that ofe,) and sie2 are distinct from t, and t, 


respectively because the latter two are generator constants while the former two are not. Therefore, in 
the system S U {e, + e,} we have the following situation: 


ofe,) > o(e,) —*t, ofe,) + at and t, ¥ t.. 


Thus, S U {e, — e,} violates the principle of definition. Contradiction. 
Q.E.D. 


Theorem 7 
PW is a Perturbed World. Suppose 


(1) e, is an expression so that for every substitution o of variables to generator constants o (e,) 


is reducible using PW, and 
(2) PWU {e, — e,} is convergent. 

Then, €, = e, is a theorm of PW. 
Proof PW is convergent. Therefore, to show that e, = e, is a theorem of PW, we have to show that 
for every substitution o of the variables in e, and €, by generator terms of the appropriate type, o (e,) 
and o(e,) have the same normal forms. . 
The proof is by contradiction. Let us suppose that PW U {e, ad e,} is convergent, but _ 
e, = @, is not a theorem of PW. This means, there exists a o such that t, = o(e,)) and t, = o(e,)+ 
are distinct. By the second premise of the theorem, therefore, we have the following sidaline in PW U 
{e,— e,} 

a (e,) << —*t, 

a(e,)+* t,andt, # t, 

Therefore, PW U {e, —+ e,} is not convergent. Notice the need for the second premise. If 
we did not have this premise o(e,) could be identical to t,, in which case PWU {e, —+ €,} is still 
convergent. , 
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Q.E.D. 


Theorem 8 ‘ 
A rewriting system R satisfies the principle of definition if it satisfies the following conditions: 


(i) R is well-spanned. 
(2) R has FTP. 


(3) Every critical pair <a,, a> of R is such that a, =a, is a theorem of R. 


Proof The first two conditions ensure that every constant in R has at least one normal form, and that 
every normal form is a gencrator constant. The following argument shows that every constant has a 
unique normal form. 


The proof is by contradiction. Suppose there exists a constant that has two distinct normal form. 
Then, according to the KB-theorem there exists a noncnvergent critical pair. This contradicts the third 


condition in the statement of the theorem. Contradiction. 
Q.E.D. 
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